library(dplyr)
library(tidyr)
library(ggplot2)
library(ggfortify)
## theme for ggplot
theme_set(theme_classic())
theme_update(text = element_text(size = 14))R
Day 7
Freie Universität Berlin @ Theoretical Ecology
January 19, 2024
library(dplyr)
library(tidyr)
library(ggplot2)
library(ggfortify)
## theme for ggplot
theme_set(theme_classic())
theme_update(text = element_text(size = 14))We want to analyse whether the species Quercus robur and Fagus sylvatica differ in their response to warming in spring.
We designed the following experiment:
\[ GDD = \sum_{t=t_0}^{t_1} T_{mean, t} - T_{base} \]
Nullhypothesis: Quercus robur and Fagus sylvatica do not differ in their response to warming in spring.
\[ H_0: \text{GDD} = \beta_{\text{species}} + \beta_{\text{treatment}} + \epsilon \]
Alternative hypothesis: Quercus robur and Fagus sylvatica differ in their response to warming in spring.
\[ H_1: \text{GDD} = \beta_{\text{species}} + \beta_{\text{treatment}} + \beta_{\text{species-treament}} + \epsilon \]
Null hypothesis
flowchart TB
S --> R
T --> R
S[species]
T[treatment]
R[response]
Alternative hypothesis
flowchart TB
S --> R
T --> ST
S --> ST
ST --> R
T --> R
S[species]
T[treatment]
R[response]
ST[species-treatment]
This is a simplified version of the data from:
I. Beil, J. Kreyling, C. Meyer, N. Lemcke, and A. V. Malyshev, “Late to bed, late to rise—Warmer autumn temperatures delay spring phenology by delaying dormancy,” Global Change Biology, 2021. doi: 10.1111/gcb.15858.
set.seed(123)
n_data <- 15
df_data <- expand_grid(
species = factor(c("beech", "oak")),
treatment = factor(c("control", "warming"))) %>%
slice(rep(1:n(), each = n_data)) %>%
mutate(GDD =
ifelse(
species == "beech",
ifelse(treatment == "control",
rnorm(n_data, mean = 263, sd = 42),
rnorm(n_data, mean = 197, sd = 20)),
ifelse(treatment == "control",
rnorm(n_data, mean = 166, sd = 45),
rnorm(n_data, mean = 116, sd = 44)))) df_data# A tibble: 60 × 3
species treatment GDD
<fct> <fct> <dbl>
1 beech control 239.
2 beech control 253.
3 beech control 328.
4 beech control 266.
5 beech control 268.
6 beech control 335.
7 beech control 282.
8 beech control 210.
9 beech control 234.
10 beech control 244.
# ℹ 50 more rows
ggplot(df_data, aes(x = treatment, y = GDD)) +
geom_jitter(alpha = 0.5, size = 3, width = 0.1, height = 0) +
facet_wrap(~species) +
theme_light() +
theme(text = element_text(size = 14)) lm1 <- lm(GDD ~ species + treatment + species:treatment, data = df_data)
lm1
Call:
lm(formula = GDD ~ species + treatment + species:treatment, data = df_data)
Coefficients:
(Intercept) speciesoak
269.40 -90.11
treatmentwarming speciesoak:treatmentwarming
-77.33 16.74
\[ \text{GDD} = \left\{ \begin{array}{ll} 269.40 +\epsilon & \text{if control, beech} \\ 269.40 -90.11 +\epsilon & \text{if control, oak} \\ 269.40 -77.33 +\epsilon& \text{if warming, beech} \\ 269.40 -90.11 -77.33 + 16.74 +\epsilon& \text{if warming, oak} \\ \end{array}\right. \]
lm(GDD ~ species + treatment, data = df_data)
Call:
lm(formula = GDD ~ species + treatment, data = df_data)
Coefficients:
(Intercept) speciesoak treatmentwarming
265.21 -81.74 -68.96
\[ \text{GDD} = \left\{ \begin{array}{ll} 265.21 +\epsilon & \text{if control, beech} \\ 265.21 -81.74 +\epsilon & \text{if control, oak} \\ 265.21 -68.96 +\epsilon& \text{if warming, beech} \\ 265.21 -81.74 -68.96 +\epsilon& \text{if warming, oak} \\ \end{array}\right. \]
autoplot(lm1, which = 1:2)drop1(lm1, test = "F")Single term deletions
Model:
GDD ~ species + treatment + species:treatment
Df Sum of Sq RSS AIC F value Pr(>F)
<none> 63632 425.99
species:treatment 1 1051.5 64683 424.97 0.9254 0.3402
how to read the output:
<none> \(\rightarrow\) nothing is removed) is the full modelspecies:treatment interactiondf: change of the degrees of freedom from the full model, i.e. the number of parameters that were droppedSum of Sq: change of the sum of squares from the full model to the reduced modelRSS: residual sum of squares\(\rightarrow\) we cannot reject the null hypothesis that the species do not differ in their response to warming in spring (F = 0.92, df = 1, p = 0.34).
lm_full <- lm1
lm_reduced <- lm(GDD ~ species + treatment, data = df_data)
# number of observations
n <- nrow(df_data)
# sum of squares
RSS_full <- sum(resid(lm_full)^2)
RSS_reduced <- sum(resid(lm_reduced)^2)
sum_sq <- RSS_reduced - RSS_full
# number of parameters
p_reduced <- length(coef(lm_reduced))
p_full <- length(coef(lm_full))
# F statistic
F <- (sum_sq / (p_full - p_reduced )) / (RSS_full / (n - p_full))
# degrees of freedom
df1 <- p_full - p_reduced
df2 <- n - p_full
# p-value
pf(F, df1 = df1, df2 = df2, lower.tail = FALSE)[1] 0.3402041
Hypothesis testing for linear models