In this notebook, we’ll get some practice making plots with ggplot using the Palmer Penguins dataset.
Load packages and set conflicted preferences:
library(dplyr)
library(ggplot2)
## Load the conflicted package
library(conflicted)
conflict_prefer("filter", "dplyr", quiet = TRUE)
conflict_prefer("count", "dplyr", quiet = TRUE)
conflict_prefer("select", "dplyr", quiet = TRUE)
conflict_prefer("arrange", "dplyr", quiet = TRUE)
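As an aside, an alternative to declaring a preference is to qualify a single call with its package name. The sketch below uses the built-in mtcars data frame purely for illustration (it is not part of this notebook's data), and the name four_cyl is an arbitrary choice:

```r
library(dplyr)

# Qualifying the call with dplyr:: leaves no ambiguity about which filter()
# is meant (the stats package also exports a function called filter()).
four_cyl <- dplyr::filter(mtcars, cyl == 4)
nrow(four_cyl)
```

Declaring preferences once at the top, as done above, is usually more convenient when the same functions are used throughout a notebook.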
Next, we load the Palmer Penguins data frame:
library(palmerpenguins)
glimpse(penguins)
Rows: 344
Columns: 8
$ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Ad~
$ island <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgersen, Torgersen, Torgersen, Torgersen, Torgersen, Torgers~
$ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1, 42.0, 37.8, 37.8, 41.1, 38.6, 34.6, 36.6, 38.7, 42.5, ~
$ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1, 20.2, 17.1, 17.3, 17.6, 21.2, 21.1, 17.8, 19.0, 20.7, ~
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 186, 180, 182, 191, 198, 185, 195, 197, 184, 194, 174, 18~
$ body_mass_g <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475, 4250, 3300, 3700, 3200, 3800, 4400, 3700, 3450, 4500, ~
$ sex <fct> male, female, female, NA, female, male, female, male, NA, NA, NA, NA, female, male, male, female, female, ~
$ year <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007~
Our first ggplot will be a simple scatterplot of bill length against flipper length:
ggplot(data = penguins,
       mapping = aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point()
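Running this plot typically prints a warning about removed rows, because the measurements are missing (NA) for a few penguins, as the glimpse() output above shows. One way to handle that, sketched here, is to drop those rows explicitly before plotting. The name penguins_complete is just an illustrative choice, not something defined elsewhere in this notebook:

```r
library(dplyr)
library(ggplot2)
library(palmerpenguins)

# Keep only rows where both plotted measurements are present
penguins_complete <- dplyr::filter(
  penguins,
  !is.na(flipper_length_mm),
  !is.na(bill_length_mm)
)

# Same scatterplot as before, now without the missing-value warning
ggplot(data = penguins_complete,
       mapping = aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point()
```

Leaving the warning in place is also reasonable; the filtered version just makes the handling of missing data explicit.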
Next, change the color of the dots by manually specifying a color value in the geom function:
ggplot(data = penguins,
       mapping = aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point(color = "blue")
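A common stumbling block here is the difference between setting a color (outside aes()) and mapping a color (inside aes()). The following sketch contrasts the two; the object names p_set and p_map are arbitrary:

```r
library(ggplot2)
library(palmerpenguins)

# Setting: the color is a fixed value passed to the geom, outside aes()
p_set <- ggplot(data = penguins,
                mapping = aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point(color = "blue")

# Mapping: inside aes(), "blue" is treated as data (a single category),
# so the points get a default scale color and a legend appears
p_map <- ggplot(data = penguins,
                mapping = aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point(mapping = aes(color = "blue"))

p_set
p_map
```

If the points come out in an unexpected color with a legend labeled "blue", the value was most likely placed inside aes() by mistake.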
Next, we color them by species:
ggplot(data = penguins,
       mapping = aes(x = flipper_length_mm, y = bill_length_mm, color = species)) +
  geom_point()
Don’t like the default colors (which are not color-blind friendly)? We can specify our preferred palette by tacking on scale_color_manual() to our expression:
ggplot(data = penguins,
       mapping = aes(x = flipper_length_mm, y = bill_length_mm, color = species)) +
  geom_point() +
  scale_color_manual(values = c("darkorange", "darkorchid", "cyan4"))
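With an unnamed vector, the colors are matched to the species in order. A slightly more explicit variant, sketched below, names each color after the species it belongs to, so the pairing no longer depends on ordering. The names species_colors and p_named are illustrative:

```r
library(ggplot2)
library(palmerpenguins)

# Names must match the levels of the species factor exactly
species_colors <- c(Adelie    = "darkorange",
                    Chinstrap = "darkorchid",
                    Gentoo    = "cyan4")

p_named <- ggplot(data = penguins,
                  mapping = aes(x = flipper_length_mm, y = bill_length_mm, color = species)) +
  geom_point() +
  scale_color_manual(values = species_colors)

p_named
```

The same named vector can be reused across several plots to keep each species in a consistent color.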
1A) Make a scatterplot with bill depth on the x-axis and body mass on the y-axis. [Answer]
## Your answer here
1B) Visualize your plot of bill depth vs. body mass by species, using different i) colors, and ii) shapes. [Hint] [Solution]
## Your answer here
Another way we can differentiate the points is by using facets (i.e., a separate plot for each group). We can ask for facets by adding facet_wrap() or facet_grid() to our plot expression:
ggplot(data = penguins,
       mapping = aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point() +
  facet_wrap(~species)
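The example above uses facet_wrap(). For comparison, here is a minimal sketch of facet_grid(), which arranges panels in rows and columns defined by two variables. Using island for the rows is an illustrative choice, and p_grid is an arbitrary name:

```r
library(ggplot2)
library(palmerpenguins)

# Rows are islands, columns are species; each panel shows one combination
p_grid <- ggplot(data = penguins,
                 mapping = aes(x = flipper_length_mm, y = bill_length_mm)) +
  geom_point() +
  facet_grid(island ~ species)

p_grid
```

Some panels will be empty, since not every species was recorded on every island.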
You can use geom_boxplot() to draw boxplots. To create multiple boxplots for different groups of data, pass a column that has categorical (character or factor) values as either the x or y property in aes(). ggplot will take care of the rest!
ggplot(penguins, aes(x = species, y = flipper_length_mm)) +
  geom_boxplot()
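To illustrate the "either x or y" point, the sketch below swaps the two aesthetics to draw horizontal boxplots. This assumes a reasonably recent ggplot2 (3.3.0 or later), where the geom infers the orientation from the data; p_horizontal is an arbitrary name:

```r
library(ggplot2)
library(palmerpenguins)

# The categorical column is now on the y-axis, so the boxes run horizontally
p_horizontal <- ggplot(penguins, aes(x = flipper_length_mm, y = species)) +
  geom_boxplot()

p_horizontal
```

Horizontal boxplots can be easier to read when the group labels are long.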
Make a histogram of flipper length for each species. [Hint] [Solution]
## Your answer here
Remember to save your work to render an HTML file.