Appendix C: Tidyverse and Tibbles
C.1 Overview
The Tidyverse is a collection of R packages designed for data science.
They share a common design philosophy and work seamlessly together.
Core packages include:
- ggplot2: data visualization
- dplyr: data manipulation
- tidyr: data tidying
- readr: data import
- purrr: functional programming
- tibble: modern data frames
- stringr: string manipulation
- forcats: working with factors
You can load them all at once with library(tidyverse).
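A sketch contrasting the single library(tidyverse) call with attaching only the packages a script needs. It assumes the packages are already installed (for example via install.packages("tidyverse")):

```r
# Option 1: attach every core Tidyverse package in one step
library(tidyverse)

# Option 2: attach only the packages a particular script uses
library(tibble)
library(dplyr)
```

Either option makes tibble() and the dplyr verbs used later in this appendix available. The second keeps the list of attached packages shorter.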
C.2 What Are Tibbles?
Tibbles are modern replacements for base R data frames.
C.2.1 Key Features
- Don’t convert strings to factors automatically
- Never change variable names
- Print in a cleaner, more readable way
- Show only the first 10 rows and as many columns as fit on screen
Example:
# A tibble: 5 × 3
      x     y z
  <int> <dbl> <chr>
1     1     1 a
2     2     4 b
3     3     9 c
4     4    16 d
5     5    25 e
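The code that produced this printout is not shown. A call along the following lines would yield the same tibble. The name tb matches the object used later in this appendix, but the exact construction is an assumption:

```r
library(tibble)

# Reconstruction (assumed): integer x, its square as a double, and letters
tb <- tibble(
  x = 1:5,
  y = x^2,          # tibble() lets a column refer to an earlier one
  z = letters[1:5]
)

tb
```

Note that y prints as <dbl> even though x is <int>, because ^ returns a double in R.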
C.3 Differences from Data Frames
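- Subsetting with $ and [[ works the same for exact column names, but tibbles are stricter about everything else
- Tibbles don’t do partial matching: $ with an abbreviated column name returns NULL with a warning instead of silently matching
- Printing is truncated by default (no flooding the console)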
tb$y
[1]  1  4  9 16 25
tb[["z"]]
[1] "a" "b" "c" "d" "e"
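The stricter behaviour is easiest to see side by side. The following is a minimal sketch, with df_demo and tb_demo as made-up objects for illustration only:

```r
library(tibble)

df_demo <- data.frame(value = 1:3, label = c("a", "b", "c"))
tb_demo <- as_tibble(df_demo)

# $ on a data frame partially matches "val" to the column "value"
df_partial <- df_demo$val

# $ on a tibble does not: the result is NULL, and a warning is raised
# (suppressed here so the sketch runs quietly)
tb_partial <- suppressWarnings(tb_demo$val)

# [ with a single column: a data frame drops to a vector, a tibble stays a tibble
df_col <- df_demo[, "value"]
tb_col <- tb_demo[, "value"]

is.null(tb_partial)
is_tibble(tb_col)
```

The last difference, [ always returning a tibble, is a common reason older code behaves unexpectedly when handed a tibble.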
C.4 Creating Tibbles
You can create tibbles manually with tibble() or convert data frames with as_tibble().
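For the manual route, a minimal sketch follows. The column names and values are invented for illustration, and tribble() is shown as an optional row-by-row alternative from the same package:

```r
library(tibble)

# Column-by-column construction; later columns may use earlier ones
scores <- tibble(
  id    = 1:3,
  score = c(88, 92, 79),
  pass  = score >= 80
)

# Row-by-row construction, convenient for small hand-typed tables
scores_by_row <- tribble(
  ~id, ~score,
  1L,  88,
  2L,  92,
  3L,  79
)
```

Converting an existing data frame works as follows: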
df <- data.frame(a = 1:3, b = letters[1:3])
tb2 <- as_tibble(df)
C.5 Working with Tibbles
Tibbles work seamlessly with all dplyr verbs:
tb3 <- tibble(
  x = 1:6,
  y = c("a", "a", "b", "b", "c", "c")
)
tb3 |>
  dplyr::group_by(y) |>
  dplyr::summarize(mean_x = mean(x))
# A tibble: 3 × 2
  y     mean_x
  <chr>  <dbl>
1 a        1.5
2 b        3.5
3 c        5.5
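Other verbs chain in the same way, and each step returns a tibble. A short sketch that rebuilds tb3 so it runs on its own (the particular filter and mutate steps are chosen only for illustration):

```r
library(tibble)
library(dplyr)

tb3 <- tibble(
  x = 1:6,
  y = c("a", "a", "b", "b", "c", "c")
)

result <- tb3 |>
  filter(x > 2) |>          # keep rows where x exceeds 2
  mutate(x_sq = x^2) |>     # add a derived column
  arrange(desc(x))          # sort from largest x to smallest

is_tibble(result)
```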
C.6 Best Practices with Tibbles
- Always use tibble() for clean, predictable data structures
- Avoid row names; instead, use an explicit column
- Use glimpse() for quick inspection
- Use print(n = Inf) to see all rows when needed
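These practices can be tried on the built-in mtcars data set, which stores car models as row names. This is a minimal sketch; the column name model is an arbitrary choice:

```r
library(tibble)
library(dplyr)

# Move the row names into an explicit column while converting
cars_tb <- as_tibble(mtcars, rownames = "model")

# Compact overview: one line per column, with its type and first values
glimpse(cars_tb)

# Default printing shows 10 rows; n = Inf prints all of them
print(cars_tb, n = Inf)
```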
C.7 When to Convert Back to Data Frames
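Some older functions, in base R and in packages written before tibbles existed, don’t work with tibbles, usually because they expect [ to return a vector when a single column is selected.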
Use as.data.frame() if you need to revert:
df_back <- as.data.frame(tb)
C.7.1 In-Class Exercise
- Create a tibble with three columns: name, age, and score.
- Use mutate() to add a new column grade based on score.
- Group by grade and calculate the average age.
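One possible approach is sketched below. The student data and the grade cutoffs are invented for illustration; any reasonable rule for deriving grade from score would do:

```r
library(tibble)
library(dplyr)

# Step 1: a tibble with name, age, and score (made-up values)
students <- tibble(
  name  = c("Ana", "Ben", "Chloe", "Dev"),
  age   = c(20, 22, 21, 23),
  score = c(91, 78, 85, 64)
)

# Step 2: derive grade from score (cutoffs are an assumption)
graded <- students |>
  mutate(grade = case_when(
    score >= 90 ~ "A",
    score >= 80 ~ "B",
    score >= 70 ~ "C",
    TRUE        ~ "D"
  ))

# Step 3: average age within each grade
avg_age <- graded |>
  group_by(grade) |>
  summarize(mean_age = mean(age))

avg_age
```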
C.8 Conclusion
Tibbles are at the heart of the Tidyverse workflow, offering:
- Clean printing
- Safer subsetting
- Compatibility with the pipe operator and dplyr verbs
Use them as your default data structure in this course.