# A tibble: 123 × 7
genus species light temp moisture reaction nitrogen
<chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Acer platanoides 4 6 NA NA NA
2 Acer pseudoplatanus 4 NA 6 NA 7
3 Achillea millefolium 8 NA 4 NA 5
4 Aegopodium podagraria 5 5 6 7 8
5 Aesculus hippocastanum 5 6 NA NA NA
6 Alliaria petiolata 5 6 5 7 9
7 Allium canadense NA NA NA NA NA
8 Allium schoenoprasum 7 NA NA 7 2
9 Allium strictum 9 6 2 6 1
10 Alnus glutinosa 5 5 9 6 NA
# ℹ 113 more rows
In the vegetation data there is only one column with the species name, while in the indicator dataset there is one column for the genus and one for the species. This has to match to have proper keys in the tables. The primary key is the the species in the indicator value dataset, while the foreign key is the species column in the vegetation survey dataset.
Here, we unite the genus and species columns in the indicator table, but we could also separate the species column in the vegetation table. Both approaches should work,
indi2 <- indi %>%unite("species", genus:species, sep =" ")
Now, we check, if we have proper primary key with unique values only.
indi2 %>%count(species) %>%filter(n >2)
# A tibble: 0 × 2
# ℹ 2 variables: species <chr>, n <int>
Now, we can join the two tables and calculate the mean indicator values for every plot. When we want to compare the plot types later, this variable also has to be included in the grouping call.
# Join the two tables by adding the indicator values to the vegetation surveysveg2 <- veg %>%left_join(indi2)# Calculate the mean indicator values for each plotindi_mean <- veg2 %>%group_by(plotID, plot_type) %>%summarise(light_mean =mean(light, na.rm = T),light_mean_weighted =weighted.mean(light, cover_londo, na.rm = T))
And finally, we create boxplots to compare the indicator values between the open and half-open sites
Indeed, the average light indicator value if higher in the open sites. This reflects that species on open sites are adapted to higher light availability.
2 Solution as one script without output
# Read the dataindi <-read_csv("data/03_ellenberg_indicator_values.csv")veg <-read_csv("data/03_vegetation_berlin.csv")# Create a single column with the species and genus names in the indicator table# (Alternatively, you could split the single column in the vegetation dataset into two columns)indi2 <- indi %>%unite("species", genus:species, sep =" ")# Check if you have a proper primary key with unique values onlyindi2 %>%count(species) %>%filter(n >2)# Join the two tables by adding the indicator values to the vegetation surveysveg2 <- veg %>%left_join(indi2)# Calculate the mean indicator values for each plotindi_mean <- veg2 %>%group_by(plotID, plot_type) %>%summarise(light_mean =mean(light, na.rm = T),light_mean_weighted =weighted.mean(light, cover_londo, na.rm = T))# Plot the differences as boxplotggplot(indi_mean, aes(plot_type, light_mean_weighted)) +geom_boxplot() +ylab("Average light indicator value")