Day 4
Freie Universität Berlin @ Theoretical Ecology
January 18, 2024
… can make even small changes frustrating and difficult.
Project setup and structure
Always make your project an RStudio Project (if possible)!
✅ You already did that.
RStudio offers a lot of settings and options.
So have a ☕ and check out Tools -> Global Options and all the other buttons.
Your collaborators and your future self will love you for this.
File names should be
Names should allow for easy searching, grouping and extracting information from file names.
📄 2023-04-20 temperature göttingen.csv
📄 2023-04-20 rainfall göttingen.csv
📄 2023-04-20_temperature_goettingen.csv
📄 2023-04-20_rainfall_goettingen.csv
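As a rough illustration of why underscore-separated names pay off, the sketch below splits such file names into their parts using base R only. The column names (date, variable, location) are assumptions made for this example, not a fixed convention.

```r
# Example file names following the date_variable_location pattern
files <- c(
  "2023-04-20_temperature_goettingen.csv",
  "2023-04-20_rainfall_goettingen.csv"
)

# Drop the extension, then split each name at the underscores
parts <- strsplit(tools::file_path_sans_ext(files), "_", fixed = TRUE)

# Collect the pieces in a data frame, one row per file
file_info <- data.frame(
  file = files,
  date = as.Date(vapply(parts, `[`, character(1), 1)),
  variable = vapply(parts, `[`, character(1), 2),
  location = vapply(parts, `[`, character(1), 3)
)

# Searching and grouping now work on columns instead of raw strings
file_info[file_info$variable == "rainfall", ]
```

With spaces or umlauts in the names, the same split would need extra special cases, which is the practical argument for the second pair of file names.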
Which file names would you like to read at 4 a.m.?
📄 01preparedata.R
📄 01firstscript.R
📄 01_prepare-data.R
📄 01_temperature-trend-analysis.R
If you order your files by name, the ordering should make sense:
Numbers (01, 02, …)
Dates in YYYY-MM-DD format
📄 2023-04-20_temperature_goettingen.csv
📄 2023-04-21_temperature_goettingen.csv
📄 01_prepare-data.R
📄 02_lm-temperature-trend.R
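The effect of zero-padding and ISO dates on sorting can be tried out directly. This is a small sketch in base R; the unpadded file names and "10_make-figures.R" are made up for the demonstration, and `method = "radix"` is used only to make the sort independent of the local language settings.

```r
# Without zero-padding, "10_..." is sorted before "2_..."
unpadded <- c("1_prepare-data.R", "2_lm-temperature-trend.R", "10_make-figures.R")
sorted_unpadded <- sort(unpadded, method = "radix")
sorted_unpadded

# sprintf("%02d", ...) left-pads numbers with zeros to two digits
steps <- c("prepare-data", "lm-temperature-trend")
script_names <- sprintf("%02d_%s.R", seq_along(steps), steps)
script_names

# Dates in YYYY-MM-DD format sort chronologically as plain text
data_files <- c(
  "2023-04-21_temperature_goettingen.csv",
  "2023-04-20_temperature_goettingen.csv"
)
sorted_files <- sort(data_files, method = "radix")
sorted_files
```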
Good practice R coding
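library() calls on top
# Load packages -----------------------------------------------------------
library(readr)
library(ggplot2)
# Load data ---------------------------------------------------------------
input_data <- read_csv(input_file)
# Plot data ---------------------------------------------------------------
ggplot(input_data, aes(x = x, y = y)) +
  geom_point()
Insert a section comment in RStudio with Ctrl/Cmd + Shift + R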
snake_case for longer variable names
# Good
day_one
day_1
# Bad
DayOne
dayone
first_day_of_the_month
dm1
# Good
x[, 1]
# Bad
x[ , 1]
x[,1]
x[ ,1]
# Good
mean(x, na.rm = TRUE)
# Bad
mean (x, na.rm = TRUE)
mean ( x, na.rm = TRUE )
Spaces around operators (<-, ==, +, etc.)
# Good
height <- (feet * 12) + inches
mean(x, na.rm = TRUE)
# Bad
height<-feet*12+inches
mean(x, na.rm=TRUE)
Pipes (|>) followed by new line
# Good
iris |>
  summarize(across(where(is.numeric), mean), .by = Species) |>
  arrange(desc(Sepal.Length))
# Bad
iris|>summarize(across(where(is.numeric), mean), .by = Species)|>arrange(desc(Sepal.Length))
+ in ggplot followed by new line
# Good
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, color = Species)) +
  geom_point()
# Bad
ggplot(iris, aes(x = Sepal.Width, y = Sepal.Length, color = Species))+geom_point()
Try to limit your line width to 80 characters.
# Bad
iris |> summarise(Sepal.Length = mean(Sepal.Length), Sepal.Width = mean(Sepal.Width), n = n(), .by = Species)
# Good
iris |>
  summarise(
    Sepal.Length = mean(Sepal.Length),
    Sepal.Width = mean(Sepal.Width),
    n = n(),
    .by = Species
  )
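If you want to check an existing script against the 80-character limit, a few lines of base R are enough. The helper below is a minimal sketch; `find_long_lines()` is a hypothetical name invented here and not part of any package.

```r
# Return the line numbers of all lines in `path` longer than `max_width`
find_long_lines <- function(path, max_width = 80) {
  lines <- readLines(path, warn = FALSE)
  which(nchar(lines, type = "chars") > max_width)
}

# Demo on a temporary script with one short and one very long line
demo_script <- tempfile(fileext = ".R")
writeLines(
  c(
    "x <- 1",
    paste0("y <- c(", paste(rep("1000000", 12), collapse = ", "), ")")
  ),
  demo_script
)

long_lines <- find_long_lines(demo_script)
long_lines
```

Here only the second line of the demo script is reported as too long.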
Do I really have to remember all of this?
Luckily, no! R and RStudio provide some nice helpers.
RStudio has style diagnostics that tell you where something is wrong.
{styler}
The styler package automatically styles your files and projects according to the tidyverse style guide.
# install from CRAN
install.packages("styler")
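Once installed, styler can be called from the console. The sketch below assumes the package is available; the file name in the commented-out call is a placeholder, and the file and directory functions overwrite files in place, so they are best used on version-controlled projects.

```r
library(styler)

# Style a code snippet given as text and return the styled version
styled <- style_text("height<-feet*12+inches")
styled

# Style a single file in place ("01_prepare-data.R" is a placeholder)
# style_file("01_prepare-data.R")

# Style all R files in a directory in place (here: the working directory)
# style_dir(".")
```

`style_text()` is handy for trying out what the styler would change before letting it touch any files.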
… allow you and others to work productively.
Introduction to R