Solution to the hypothesis testing exercise with the bacteria dataset

1 Script with output

library(dplyr)
library(readr)
library(ggplot2)
library(ggfortify)

## theme for ggplot
theme_set(theme_classic())
theme_update(text = element_text(size = 14))

df_bacteria <- read_csv("data/06_bacteria.csv")

## fit model with interaction
lm_bacteria <- lm(density ~ species * temperature, data = df_bacteria)

## check assumptions
autoplot(lm_bacteria, which = 1:2)

## hypothesis testing
drop1(lm_bacteria, test = "F")
Single term deletions

Model:
density ~ species * temperature
                    Df Sum of Sq    RSS     AIC F value    Pr(>F)    
<none>                           11.753 -369.98                      
species:temperature  2    78.711 90.464  -67.85  482.18 < 2.2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

## to get the coefficient values
summary(lm_bacteria)

Call:
lm(formula = density ~ species * temperature, data = df_bacteria)

Residuals:
     Min       1Q   Median       3Q      Max 
-0.61411 -0.21434  0.00655  0.18925  0.74622 

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
(Intercept)                  -5.28220    0.44964 -11.748  < 2e-16 ***
speciesspecies2             -10.19381    0.61590 -16.551  < 2e-16 ***
speciesspecies3               3.90402    0.59373   6.575 8.34e-10 ***
temperature                   0.31533    0.01620  19.461  < 2e-16 ***
speciesspecies2:temperature   0.40295    0.02225  18.109  < 2e-16 ***
speciesspecies3:temperature  -0.24009    0.02151 -11.161  < 2e-16 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 0.2857 on 144 degrees of freedom
Multiple R-squared:  0.9797,    Adjusted R-squared:  0.979 
F-statistic:  1389 on 5 and 144 DF,  p-value: < 2.2e-16

2 Explanation of the output

We tested the null hypothesis that the effect (slope) of temperature on bacterial density does not depend on the species, i.e. that there is no species:temperature interaction. The F-test from drop1() rejects this null hypothesis (F = 482.2, df = 2 and 144, p < 2.2e-16), so the slope of temperature differs between species.
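
As a sanity check, the reported F value can be recomputed by hand from the numbers in the drop1() table and the residual degrees of freedom in the summary() output. The sketch below only uses values copied from the output above; the variable names are illustrative and not part of the original script.

```r
## values copied from the drop1() and summary() output above
ss_interaction <- 78.711   # "Sum of Sq" for species:temperature
df_interaction <- 2        # "Df" for species:temperature
rss_full       <- 11.753   # "RSS" of the full model (row <none>)
df_resid       <- 144      # residual degrees of freedom of the full model

## F = (explained sum of squares per df) / (residual sum of squares per df)
f_value <- (ss_interaction / df_interaction) / (rss_full / df_resid)

## upper tail probability of the F distribution with 2 and 144 df
p_value <- pf(f_value, df_interaction, df_resid, lower.tail = FALSE)

## residual standard error, for comparison with summary()
sigma_hat <- sqrt(rss_full / df_resid)

f_value    # about 482.2, matching the table up to rounding of the inputs
p_value    # far below 2.2e-16
sigma_hat  # about 0.286
```

Because the inputs are rounded, the recomputed F value agrees with the table only to about one decimal place.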

The fitted model, with the coefficient estimates from summary() rounded to two decimals, is:

\[
\begin{aligned}
\text{density} &=
\begin{cases}
-5.28 + 0.32 \cdot \text{temperature} + \epsilon, & \text{if species = 1} \\
-5.28 - 10.19 + (0.32 + 0.40) \cdot \text{temperature} + \epsilon, & \text{if species = 2} \\
-5.28 + 3.90 + (0.32 - 0.24) \cdot \text{temperature} + \epsilon, & \text{if species = 3}
\end{cases} \\
\epsilon &\sim N(0, \sigma^2), \quad \hat{\sigma} = 0.29
\end{aligned}
\]
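
The per-species intercepts and slopes in the equation above follow from adding the species-specific terms to the reference-level (species 1) terms. A minimal sketch of that arithmetic, with the estimates copied from the summary() output into a named vector (with the fitted model at hand, coef(lm_bacteria) would give the same vector); the names coefs, intercepts and slopes are illustrative:

```r
## coefficient estimates copied from the summary() output above
coefs <- c(
  "(Intercept)"                 = -5.28220,
  "speciesspecies2"             = -10.19381,
  "speciesspecies3"             = 3.90402,
  "temperature"                 = 0.31533,
  "speciesspecies2:temperature" = 0.40295,
  "speciesspecies3:temperature" = -0.24009
)

## intercept per species: reference intercept plus the species offset
intercepts <- c(
  species1 = coefs[["(Intercept)"]],
  species2 = coefs[["(Intercept)"]] + coefs[["speciesspecies2"]],
  species3 = coefs[["(Intercept)"]] + coefs[["speciesspecies3"]]
)

## slope per species: reference slope plus the interaction term
slopes <- c(
  species1 = coefs[["temperature"]],
  species2 = coefs[["temperature"]] + coefs[["speciesspecies2:temperature"]],
  species3 = coefs[["temperature"]] + coefs[["speciesspecies3:temperature"]]
)

data.frame(
  species   = names(slopes),
  intercept = round(intercepts, 2),
  slope     = round(slopes, 2)
)
```

Rounded to two decimals, the slopes are 0.32, 0.72 and 0.08 for species 1, 2 and 3, so density increases most steeply with temperature for species 2 and only weakly for species 3.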