ggplot2: More in One
Click on the three horizontally stacked lines at the bottom-left corner of the slide to see the table of contents and jump to the section you want.
Press the letter "o" on your keyboard for a panel view of all the slides.
So far, we have learned the basics of ggplot2 and how to create popular types of figures. We can make a figure much more informative by making its aesthetics data-dependent.
For example, suppose you want to compare the history of irrigated corn yield by state in a line plot. You would draw one line per state and make the lines distinguishable, so that readers know which line belongs to which state, like this:
We can make the aesthetics of a figure data-dependent by specifying, INSIDE aes(), which variable to use for aesthetics differentiation.
Here is an example:
In this code, color = state_name is inside aes(). It tells R to divide the data into groups by state_name (by state) and draw one line per group, with the lines color-differentiated.
A legend is automatically generated.
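A minimal sketch of such a color-differentiated line plot. The data frame state_yield and its values are made up for illustration (the course dataset is not reproduced here); only the placement of color = state_name inside aes() follows the text.

```r
library(ggplot2)

# Hypothetical stand-in data: one yield value per state and year (made-up numbers)
state_yield <- data.frame(
  state_name = rep(c("Colorado", "Kansas", "Nebraska"), each = 4),
  year = rep(2015:2018, times = 3),
  corn_yield = c(180, 184, 186, 190, 170, 176, 175, 181, 195, 199, 204, 207)
)

# color = state_name is inside aes(), so the data are grouped by state
# and each state's line gets its own color (a legend is added automatically)
g_line <- ggplot(data = state_yield) +
  geom_line(aes(x = year, y = corn_yield, color = state_name))

g_line
```

Moving color outside aes() (for example, color = "blue") would instead apply one fixed color to every line.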
This exercise uses the diamonds dataset from the ggplot2 package. First, load the dataset and extract the observations with Premium cut whose color is one of E, I, and F:
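One way to do this extraction, sketched with dplyr::filter() (the use of dplyr is an assumption; base R subsetting would work as well):

```r
library(ggplot2) # provides the diamonds dataset
library(dplyr)

# Keep Premium-cut diamonds whose color is E, I, or F
premium <- diamonds %>%
  filter(cut == "Premium", color %in% c("E", "I", "F"))
```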
Using premium, create a scatter plot of price (y-axis) against depth (x-axis) by clarity:
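A possible solution sketch. It reads "by clarity" as color-differentiating the points by clarity, which is an interpretation rather than something the exercise states explicitly.

```r
library(ggplot2)
library(dplyr)

# Re-create the exercise subset so this block runs on its own
premium <- filter(diamonds, cut == "Premium", color %in% c("E", "I", "F"))

# clarity is mapped to color inside aes(), so points are colored by clarity
g_depth <- ggplot(data = premium) +
  geom_point(aes(x = depth, y = price, color = clarity))

g_depth
```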
Sometimes, you would like to visualize information across groups on separate panels.
Too much information in one panel?
On separate panels (faceting)?
We can make faceted figures by adding either facet_wrap() or facet_grid(), in which you specify which variable to use for faceting.
Here is an example:
In this code, facet_wrap(state_name ~ .) is added to a simple boxplot, which tells R to make a separate boxplot for each value of state_name (state).
Note
The . in state_name ~ . means none (facet by no other variable).
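A minimal sketch of a boxplot faceted with facet_wrap(state_name ~ .). The data frame toy_yield and its values are hypothetical stand-ins for the course data, and the choice of x and y variables is an assumption.

```r
library(ggplot2)

# Hypothetical county-level data: 6 counties x 2 years x 3 states
toy_yield <- expand.grid(
  county = 1:6,
  year = c(2017, 2018),
  state_name = c("Colorado", "Kansas", "Nebraska"),
  stringsAsFactors = FALSE
)
# Made-up yields: a state-specific base plus county and year offsets
toy_yield$corn_yield <- 150 +
  20 * match(toy_yield$state_name, c("Colorado", "Kansas", "Nebraska")) +
  3 * toy_yield$county + 2 * (toy_yield$year - 2017)

# One panel per state; within each panel, one box per year
g_box <- ggplot(data = toy_yield) +
  geom_boxplot(aes(x = factor(year), y = corn_yield)) +
  facet_wrap(state_name ~ .)

g_box
```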
Two-way faceting will (1) divide the data into groups, where each group has a unique combination of the two faceting variables, and (2) create a plot for each group.
Example
Filter county_yield to the observations from 2017 and 2018.
Create faceted density plots.
facet_wrap
facet_grid
Note
Unlike with facet_wrap(), the side of ~ on which you put each faceting variable matters a lot in facet_grid().
With facet_grid(state_name ~ year), state_name values become the rows, and year values become the columns.
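The two-way faceting example can be sketched as follows. The data frame toy_yield is a hypothetical stand-in for county_yield with made-up values; the facet_wrap() and facet_grid() calls are real ggplot2 functions.

```r
library(ggplot2)

# Hypothetical stand-in for county_yield: 6 counties x 3 years x 3 states
toy_yield <- expand.grid(
  county = 1:6,
  year = 2016:2018,
  state_name = c("Colorado", "Kansas", "Nebraska"),
  stringsAsFactors = FALSE
)
toy_yield$corn_yield <- 150 +
  20 * match(toy_yield$state_name, c("Colorado", "Kansas", "Nebraska")) +
  3 * toy_yield$county + 2 * (toy_yield$year - 2016)

# Keep only 2017 and 2018
toy_recent <- subset(toy_yield, year %in% c(2017, 2018))

g_density <- ggplot(data = toy_recent) +
  geom_density(aes(x = corn_yield))

# facet_wrap: one panel per state-year combination, laid out in sequence
g_wrap <- g_density + facet_wrap(state_name ~ year)

# facet_grid: state_name defines the rows, year defines the columns
g_grid <- g_density + facet_grid(state_name ~ year)

g_grid
```

Swapping the formula to year ~ state_name in facet_grid() would transpose the layout, with years as rows and states as columns.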
facet_grid() allows (1) the figures in different columns to have different scales for the x-axis (figures in the same column share the same x-axis scale), and (2) the figures in different rows to have different scales for the y-axis (figures in the same row share the same y-axis scale).
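A short sketch of freeing the scales in facet_grid() through its scales argument ("free", "free_x", or "free_y"). The toy data are hypothetical and only serve to make the block runnable.

```r
library(ggplot2)

# Hypothetical data (made-up values): 6 counties x 2 years x 3 states
toy_yield <- expand.grid(
  county = 1:6,
  year = c(2017, 2018),
  state_name = c("Colorado", "Kansas", "Nebraska"),
  stringsAsFactors = FALSE
)
toy_yield$corn_yield <- 150 +
  20 * match(toy_yield$state_name, c("Colorado", "Kansas", "Nebraska")) +
  3 * toy_yield$county + 2 * (toy_yield$year - 2017)

# scales = "free": x-axis scales may differ across columns,
# y-axis scales may differ across rows
g_free <- ggplot(data = toy_yield) +
  geom_density(aes(x = corn_yield)) +
  facet_grid(state_name ~ year, scales = "free")

g_free
```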
Create a variable that has the values you want to use as labels and use it as a faceting variable:
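A minimal sketch of this labeling approach. The variable name year_label, the label text, and the toy data are all hypothetical.

```r
library(ggplot2)

# Hypothetical data (made-up values)
toy_yield <- expand.grid(
  county = 1:6,
  year = c(2017, 2018),
  state_name = c("Colorado", "Kansas"),
  stringsAsFactors = FALSE
)
toy_yield$corn_yield <- 150 + 3 * toy_yield$county +
  10 * (toy_yield$state_name == "Kansas") + 2 * (toy_yield$year - 2017)

# New variable holding the text to show in the facet strips
toy_yield$year_label <- paste0("Year: ", toy_yield$year)

# Facet by the label variable instead of the raw year
g_label <- ggplot(data = toy_yield) +
  geom_boxplot(aes(x = state_name, y = corn_yield)) +
  facet_grid(. ~ year_label)

g_label
```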
Using premium, create scatter plots of price (y-axis) against carat (x-axis) by color on separate panels as shown on the right.
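A possible solution sketch. Using facet_wrap() here is an assumption, since the reference figure is not reproduced; facet_grid(color ~ .) or facet_grid(. ~ color) would also put each color on its own panel.

```r
library(ggplot2)
library(dplyr)

# Re-create the exercise subset so this block runs on its own
premium <- filter(diamonds, cut == "Premium", color %in% c("E", "I", "F"))

# One panel per diamond color
g_carat <- ggplot(data = premium) +
  geom_point(aes(x = carat, y = price)) +
  facet_wrap(color ~ .)

g_carat
```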
We have seen how to make aesthetics data-dependent (aes(color = var) inside the geom_*() function) and how to facet figures.
Important
The dataset has to be in long format to create these types of figures!!
For example, consider the following dataset in a wide format:
This dataset has county-level yields for Nebraska, Colorado, and Kansas stored in variables named 2000 and 2001 (the variable names themselves represent years).
Imagine creating boxplots of corn yield, fill-color-differentiated by state and faceted by year. You cannot specify facet_grid() properly because you do not have a single variable that represents year.
You will find that reshaping wide datasets using pivot_longer() is very useful in creating figures.
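A sketch of the reshaping step under stated assumptions: yield_wide and its values are made up to mimic the wide layout described above, and the new column names year and corn_yield are illustrative choices.

```r
library(ggplot2)
library(tidyr)

# Hypothetical wide dataset: one column per year (made-up values)
yield_wide <- data.frame(
  state_name = rep(c("Nebraska", "Colorado", "Kansas"), each = 3),
  county = rep(1:3, times = 3),
  `2000` = c(190, 195, 200, 170, 176, 181, 160, 165, 172),
  `2001` = c(193, 199, 205, 172, 179, 185, 163, 170, 174),
  check.names = FALSE # keep "2000" and "2001" as column names
)

# Stack the year columns so that year becomes a single variable
yield_long <- pivot_longer(
  yield_wide,
  cols = c(`2000`, `2001`),
  names_to = "year",
  values_to = "corn_yield"
)

# With a year variable available, faceting by year is straightforward
g_long <- ggplot(data = yield_long) +
  geom_boxplot(aes(x = state_name, y = corn_yield, fill = state_name)) +
  facet_grid(. ~ year)

g_long
```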
Important
If you specify the dataset inside ggplot(), then the dataset is used in ALL of the subsequent geom_*() unless otherwise specified.
If you specify the dataset inside geom_*(), the dataset is used only for that geom_*(), overriding the global dataset set inside ggplot().
This works, with county_yield used in both geom_point() and geom_smooth().
This does not work because no global dataset is set inside ggplot() and no dataset is supplied to geom_smooth().
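The working and non-working cases can be sketched as follows. The data frame toy_yield stands in for county_yield and holds made-up values; method = "lm" is an arbitrary choice to keep the example quiet. The failure only surfaces when the plot is built or printed, so the sketch wraps ggplot_build() in tryCatch().

```r
library(ggplot2)

# Hypothetical stand-in for county_yield (made-up values)
toy_yield <- data.frame(
  year = 2009:2018,
  corn_yield = c(160, 158, 165, 150, 170, 174, 172, 180, 183, 185)
)

# Works: the global dataset is inherited by both geoms
g_global <- ggplot(data = toy_yield) +
  geom_point(aes(x = year, y = corn_yield)) +
  geom_smooth(aes(x = year, y = corn_yield), method = "lm", formula = y ~ x)

# Does not work: geom_smooth() has no dataset to draw from
g_no_data <- ggplot() +
  geom_point(data = toy_yield, aes(x = year, y = corn_yield)) +
  geom_smooth(aes(x = year, y = corn_yield), method = "lm", formula = y ~ x)

# Building the second plot raises an error, captured here instead of stopping
build_result <- tryCatch(ggplot_build(g_no_data), error = function(e) e)
inherits(build_result, "error")
```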
To use multiple datasets inside a single ggplot object (or a figure), you just need to specify which dataset to use locally inside the individual geom_*()s.
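A minimal sketch with two datasets in one figure. Both toy_yield and the derived yearly_mean are hypothetical, with made-up values; each geom_*() receives its own data argument.

```r
library(ggplot2)

# Hypothetical county-level data (made-up values)
toy_yield <- expand.grid(county = 1:5, year = 2014:2018)
toy_yield$corn_yield <- 160 + 4 * toy_yield$county + 3 * (toy_yield$year - 2014)

# Second dataset: average yield by year
yearly_mean <- aggregate(corn_yield ~ year, data = toy_yield, FUN = mean)

# Each layer names its own dataset locally
g_multi <- ggplot() +
  geom_point(data = toy_yield, aes(x = year, y = corn_yield)) +
  geom_line(data = yearly_mean, aes(x = year, y = corn_yield), color = "red")

g_multi
```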