modelsummary
modelsummary package
We use the county_yield dataset throughout this lecture.
First, install the r.spatial.workshop.datasets package.
Then, get the data:
# A tibble: 1,956 × 10
corn_yield soy_yield year county_code state_name d0_5_9 d1_5_9 d2_5_9 d3_5_9
<dbl> <dbl> <int> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 123 42 2000 053 Kansas 2.49 2.87 0.134 0
2 188. NA 2017 095 Kansas 8.72 0 0 0
3 169. 58.4 2016 095 Kansas 1 0 0 0
4 198. NA 2015 095 Kansas 1.76 1.21 2.09 0
5 152. NA 2012 095 Kansas 6.28 1.47 9.54 4.46
6 170 42 2007 095 Kansas 0 0 0 0
7 193 49 2005 095 Kansas 4.32 0 0 0
8 173 47 2003 095 Kansas 2.29 5.16 4.46 1.09
9 165 40 2002 095 Kansas 3.71 1.48 1.90 0
10 171 52 2001 095 Kansas 9.88 0.188 0 0
# ℹ 1,946 more rows
# ℹ 1 more variable: d4_5_9 <dbl>
Variable Definitions
soy_yield: soybean yield (bu/acre)
corn_yield: corn yield (bu/acre)
d0_5_9: ratio of weeks under drought severity of 0 from May to September
d1_5_9: ratio of weeks under drought severity of 1 from May to September
d2_5_9: ratio of weeks under drought severity of 2 from May to September
d3_5_9: ratio of weeks under drought severity of 3 from May to September
d4_5_9: ratio of weeks under drought severity of 4 from May to September
Let's first run the regressions that we are going to report in tables.
model_1_corn <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = county_yield)
model_2_corn <- lm(corn_yield ~ d1_5_9 + d2_5_9 + d3_5_9 + d4_5_9, data = county_yield)
model_1_soy <- lm(soy_yield ~ d1_5_9 + d2_5_9, data = county_yield)
model_2_soy <- lm(soy_yield ~ d1_5_9 + d2_5_9 + d3_5_9 + d4_5_9, data = county_yield)
Get White-Huber robust variance-covariance matrix for the regressions:
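The code for this step did not survive on the slide, so here is a minimal sketch of one way to do it, assuming the sandwich package. The data frame toy and the objects model_toy and vcov_toy are hypothetical stand-ins (simulated values, not county_yield), and type = "HC0" is an assumption. The vcov_2_corn and vcov_2_soy objects used later would presumably be built the same way from model_2_corn and model_2_soy.

```r
library(sandwich)

# Toy stand-in for county_yield (simulated, heteroskedastic errors)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 +
  rnorm(200, sd = 5 + 3 * toy$d1_5_9)

model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

# White (HC0) heteroskedasticity-robust variance-covariance matrix
vcov_toy <- sandwich::vcovHC(model_toy, type = "HC0")

# Robust standard errors are the square roots of the diagonal elements
sqrt(diag(vcov_toy))
```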
You can supply a list of regression results to modelsummary::msummary() to create a default regression table.
| (1) | (2) | (3) | (4) |
---|---|---|---|---|
(Intercept) | 181.978 | 183.882 | 56.049 | 56.202 |
| (0.678) | (0.690) | (0.288) | (0.295) |
d1_5_9 | -0.216 | -0.367 | -0.062 | -0.069 |
| (0.135) | (0.133) | (0.055) | (0.055) |
d2_5_9 | -1.081 | -0.836 | -0.327 | -0.298 |
| (0.124) | (0.129) | (0.053) | (0.055) |
d3_5_9 | | -0.754 | | -0.173 |
| | (0.158) | | (0.090) |
d4_5_9 | | -2.194 | | -0.137 |
| | (0.320) | | (0.213) |
Num.Obs. | 1956 | 1956 | 1100 | 1100 |
R2 | 0.050 | 0.099 | 0.047 | 0.052 |
R2 Adj. | 0.049 | 0.097 | 0.046 | 0.049 |
AIC | 17806.4 | 17708.0 | 7475.7 | 7474.2 |
BIC | 17828.8 | 17741.4 | 7495.8 | 7504.2 |
Log.Lik. | -8899.218 | -8847.985 | -3733.873 | -3731.078 |
F | 51.768 | 53.480 | 27.207 | 15.043 |
RMSE | 22.89 | 22.30 | 7.21 | 7.19 |
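A table like the one above comes from passing a list of models to msummary(). Below is a minimal sketch on simulated data; toy, model_a, and model_b are hypothetical names, and output = "data.frame" is used only so the result can be inspected as a plain data.frame.

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 + rnorm(200, sd = 20)

model_a <- lm(corn_yield ~ d1_5_9, data = toy)
model_b <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

# One column per model; drop the output argument for the default rendered table
tab <- modelsummary::msummary(list(model_a, model_b), output = "data.frame")
tab
```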
modelsummary::msummary() offers multiple options to modify the default regression table to your liking:
title: put a title on the table
stars: place significance symbols (and modify the symbol placement rules)
coef_map: change the order and labels of the variable names
notes: add footnotes
fmt: change the format of numbers
statistic: type of statistics displayed along with the coefficient estimates
gof_map: define which model statistics to display
gof_omit: define which model statistics to omit from the default selection of model statistics
add_rows: add rows of arbitrary content to the table
Add stars = TRUE in modelsummary::msummary() to add significance markers.
You can modify the significance levels and markers by supplying a named vector whose elements are the significance levels and whose names are the corresponding significance markers.
Example:
| (1) |
---|---|
(Intercept) | 181.978+*+ |
| (0.678) |
d1_5_9 | -0.216 |
| (0.135) |
d2_5_9 | -1.081+*+ |
| (0.124) |
Num.Obs. | 1956 |
R2 | 0.050 |
R2 Adj. | 0.049 |
AIC | 17806.4 |
BIC | 17828.8 |
Log.Lik. | -8899.218 |
F | 51.768 |
RMSE | 22.89 |
+ p < 0.1, &+ p < 0.05, +*+ p < 0.01 | |
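As a minimal sketch of the named-vector form of stars, the call below uses simulated data and a conventional marker set. The markers chosen here, and the names toy and model_toy, are assumptions for illustration and not necessarily those used for the table above.

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 + rnorm(200, sd = 20)
model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

# Names are the markers, elements are the significance levels
tab <- modelsummary::msummary(
  list(model_toy),
  stars = c("*" = 0.1, "**" = 0.05, "***" = 0.01),
  output = "data.frame"
)
tab
```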
coef_map allows you to reorder coefficient rows and change their labels.
As with the stars option, you supply a named vector. Its names are the existing labels and its elements are the new labels.
In the table, the coefficient rows appear in the same order as in the named vector.
#--- define a coef_map vector ---#
coef_map_vec <- c(
"d1_5_9" = "DI: category 1",
"d2_5_9" = "DI: category 2",
"d3_5_9" = "DI: category 3",
"d4_5_9" = "DI: category 4",
"(Intercept)" = "Constant"
)
#--- create a table ---#
modelsummary::msummary(
list(model_2_corn, model_2_soy),
coef_map = coef_map_vec
)
| (1) | (2) |
---|---|---|
DI: category 1 | -0.367 | -0.069 |
| (0.133) | (0.055) |
DI: category 2 | -0.836 | -0.298 |
| (0.129) | (0.055) |
DI: category 3 | -0.754 | -0.173 |
| (0.158) | (0.090) |
DI: category 4 | -2.194 | -0.137 |
| (0.320) | (0.213) |
Constant | 183.882 | 56.202 |
| (0.690) | (0.295) |
Num.Obs. | 1956 | 1100 |
R2 | 0.099 | 0.052 |
R2 Adj. | 0.097 | 0.049 |
AIC | 17708.0 | 7474.2 |
BIC | 17741.4 | 7504.2 |
Log.Lik. | -8847.985 | -3731.078 |
F | 53.480 | 15.043 |
RMSE | 22.30 | 7.19 |
The coef_omit option lets you omit coefficient rows from the default selection.
You supply a string that is treated as a regular expression, and coefficient rows that match the pattern are omitted.
Example
d2 matches d2_5_9, so the rows associated with the coefficient on d2_5_9 are removed.
| (1) | (2) |
---|---|---|
(Intercept) | 183.882 | 56.202 |
| (0.690) | (0.295) |
d1_5_9 | -0.367 | -0.069 |
| (0.133) | (0.055) |
d3_5_9 | -0.754 | -0.173 |
| (0.158) | (0.090) |
d4_5_9 | -2.194 | -0.137 |
| (0.320) | (0.213) |
Num.Obs. | 1956 | 1100 |
R2 | 0.099 | 0.052 |
R2 Adj. | 0.097 | 0.049 |
AIC | 17708.0 | 7474.2 |
BIC | 17741.4 | 7504.2 |
Log.Lik. | -8847.985 | -3731.078 |
F | 53.480 | 15.043 |
RMSE | 22.30 | 7.19 |
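A minimal sketch of the coef_omit pattern described above, on simulated data (toy and model_toy are hypothetical names):

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 + rnorm(200, sd = 20)
model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

# "d2" is a regular expression: every coefficient whose name matches is dropped
tab <- modelsummary::msummary(
  list(model_toy),
  coef_omit = "d2",
  output = "data.frame"
)
tab
```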
The gof_omit option lets you omit model statistics like \(R^2\) from the default selection.
You supply a string that is treated as a regular expression, and statistics that match the pattern are omitted.
Example
IC matches AIC and BIC, and Adj matches R2 Adj.
| (1) | (2) |
---|---|---|
(Intercept) | 183.882 | 56.202 |
| (0.690) | (0.295) |
d1_5_9 | -0.367 | -0.069 |
| (0.133) | (0.055) |
d2_5_9 | -0.836 | -0.298 |
| (0.129) | (0.055) |
d3_5_9 | -0.754 | -0.173 |
| (0.158) | (0.090) |
d4_5_9 | -2.194 | -0.137 |
| (0.320) | (0.213) |
Num.Obs. | 1956 | 1100 |
R2 | 0.099 | 0.052 |
Log.Lik. | -8847.985 | -3731.078 |
F | 53.480 | 15.043 |
RMSE | 22.30 | 7.19 |
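A minimal sketch of the gof_omit pattern "IC|Adj" described above, on simulated data (toy and model_toy are hypothetical names):

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 + rnorm(200, sd = 20)
model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

# "IC|Adj" drops AIC, BIC, and the adjusted R2 from the model statistics
tab <- modelsummary::msummary(
  list(model_toy),
  gof_omit = "IC|Adj",
  output = "data.frame"
)
tab
```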
The add_rows option can be used to insert arbitrary rows into a table. Adding rows using add_rows is a two-step process:
Step 1: create a data.frame (or tibble) to insert
#--- create a table (data.frame) to insert ---#
(
  rows <- data.frame(
    c1 = c("FE: County", "FE: Year"),
    c2 = c("Yes", "Yes"),
    c3 = c("No", "No")
  )
)
c1 c2 c3
1 FE: County Yes No
2 FE: Year Yes No
Step 2: tell where to insert the rows by setting the "position" attribute of the data.frame: attr(data.frame, "position") <- row numbers
#--- tell where to insert ---#
attr(rows, "position") <- c(3, 4)
#--- create a table with rows inserted ---#
modelsummary::msummary(
  list(Model1 = model_2_corn, Model2 = model_2_soy),
  gof_omit = "IC|Adj",
  coef_omit = "d",
  add_rows = rows
)
| Model1 | Model2 |
---|---|---|
(Intercept) | 183.882 | 56.202 |
| (0.690) | (0.295) |
FE: County | Yes | No |
FE: Year | Yes | No |
Num.Obs. | 1956 | 1100 |
R2 | 0.099 | 0.052 |
Log.Lik. | -8847.985 | -3731.078 |
F | 53.480 | 15.043 |
RMSE | 22.30 | 7.19 |
It is often the case that we replace the default variance-covariance matrix with a robust one for valid statistical testing.
You can achieve this using the statistic_override option. You give it a list of variance-covariance matrices in the order in which their corresponding regression results appear in the table.
Syntax:
Default:
modelsummary::msummary(
  list(Model1 = model_2_corn, Model2 = model_2_soy),
  gof_omit = "IC|R",
  coef_omit = "d3|d4"
  #--- no statistical override ---#
)
| Model1 | Model2 |
---|---|---|
(Intercept) | 183.882 | 56.202 |
| (0.690) | (0.295) |
d1_5_9 | -0.367 | -0.069 |
| (0.133) | (0.055) |
d2_5_9 | -0.836 | -0.298 |
| (0.129) | (0.055) |
Num.Obs. | 1956 | 1100 |
Log.Lik. | -8847.985 | -3731.078 |
F | 53.480 | 15.043 |
VCOV swapped:
modelsummary::msummary(
  list(Model1 = model_2_corn, Model2 = model_2_soy),
  gof_omit = "IC|R",
  coef_omit = "d3|d4",
  statistic_override = list(vcov_2_corn, vcov_2_soy)
)
| Model1 | Model2 |
---|---|---|
(Intercept) | 183.882 | 56.202 |
| (0.690) | (0.295) |
d1_5_9 | -0.367 | -0.069 |
| (0.133) | (0.055) |
d2_5_9 | -0.836 | -0.298 |
| (0.129) | (0.055) |
Num.Obs. | 1956 | 1100 |
Log.Lik. | -8847.985 | -3731.078 |
F | 53.480 | 15.043 |
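In recent versions of modelsummary, the statistic_override argument has been superseded by vcov, which accepts the same kind of list. The sketch below assumes such a version, along with the sandwich package. The names toy, model_toy, and vcov_toy are hypothetical, and the data are simulated with heteroskedastic errors so that the robust and default standard errors visibly differ.

```r
library(modelsummary)
library(sandwich)

# Toy stand-in for county_yield (simulated, heteroskedastic errors)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 +
  rnorm(200, sd = 5 + 3 * toy$d1_5_9)
model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)
vcov_toy <- sandwich::vcovHC(model_toy, type = "HC0")

# Default standard errors
tab_default <- modelsummary::msummary(list(model_toy), output = "data.frame")

# Robust standard errors: one matrix per model, in the same order as the models
tab_robust <- modelsummary::msummary(
  list(model_toy),
  vcov = list(vcov_toy),
  output = "data.frame"
)
tab_robust
```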
You can save the table to a file by providing a file name to the output option.
The file type is inferred from the file extension (for example, .docx, .html, .tex, or .md).
The docx option may be particularly useful for those who want to put finishing touches on the table manually in Word.
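A minimal sketch of saving a table to a file, on simulated data. The file name reg_table.md and the use of tempdir() are assumptions for illustration. Replacing the extension with .docx would write a Word file instead, which may need additional packages depending on your setup.

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 + rnorm(200, sd = 20)
model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

# The extension of the file name decides the file type (here: Markdown)
out_file <- file.path(tempdir(), "reg_table.md")
modelsummary::msummary(list(model_toy), output = out_file)

# Check that the file was written
file.exists(out_file)
```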
Using the output option in modelsummary::msummary(), you can turn the regression table into R objects that are readily modifiable by the gt, kableExtra, and flextable packages.
Example: flextable
#--- create a regression table and turn it into a flextable ---#
reg_table_ft <- list(model_1_corn, model_1_soy) %>%
  modelsummary::msummary(output = "flextable")
#--- check the class ---#
class(reg_table_ft)
[1] "flextable"
Example: gt
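The gt counterpart of the flextable example can be sketched as follows, assuming the gt package is installed. The names toy, model_toy, and reg_table_gt are hypothetical, and the data are simulated.

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 + rnorm(200, sd = 20)
model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

#--- create a regression table and turn it into a gt_tbl ---#
reg_table_gt <- modelsummary::msummary(list(model_toy), output = "gt")

#--- check the class ---#
class(reg_table_gt)
```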
The regression table created using modelsummary::msummary() with output = "flextable" is a flextable object.
So, we can use our knowledge of the flextable package to further modify the regression table.
For details on how to use the flextable package, see the flextable lecture notes.
Here I will just give you an example of flextable operations.
Example
| Corn 1 | Corn 2 | Soy 1 | Soy 2 |
---|---|---|---|---|
(Intercept) | 181.978 | 183.882 | 56.049 | 56.202 |
| (0.678) | (0.690) | (0.288) | (0.295) |
d1_5_9 | -0.216 | -0.367 | -0.062 | -0.069 |
| (0.135) | (0.133) | (0.055) | (0.055) |
d2_5_9 | -1.081 | -0.836 | -0.327 | -0.298 |
| (0.124) | (0.129) | (0.053) | (0.055) |
d3_5_9 | | -0.754 | | -0.173 |
| | (0.158) | | (0.090) |
d4_5_9 | | -2.194 | | -0.137 |
| | (0.320) | | (0.213) |
Num.Obs. | 1956 | 1956 | 1100 | 1100 |
R2 | 0.050 | 0.099 | 0.047 | 0.052 |
Log.Lik. | -8899.218 | -8847.985 | -3733.873 | -3731.078 |
F | 51.768 | 53.480 | 27.207 | 15.043 |
RMSE | 22.89 | 22.30 | 7.21 | 7.19 |
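The code behind this example is missing from the slide. A minimal sketch of this kind of flextable post-processing, assuming the flextable package is installed, might look like the following. The specific operations (bold header, red text, autofit) and the names toy, model_toy, and ft are illustrative choices, not the original example.

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(d1_5_9 = runif(200, 0, 10), d2_5_9 = runif(200, 0, 10))
toy$corn_yield <- 180 - 0.5 * toy$d1_5_9 - toy$d2_5_9 + rnorm(200, sd = 20)
model_toy <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = toy)

#--- regression table as a flextable, with custom column label ---#
ft <- modelsummary::msummary(
  list("Corn 1" = model_toy),
  output = "flextable",
  gof_omit = "IC|Adj"
)

#--- flextable operations: bold header, red intercept rows, autofit ---#
ft <- flextable::bold(ft, i = 1, part = "header")
ft <- flextable::color(ft, i = 1:2, j = 2, color = "red")
ft <- flextable::autofit(ft)
ft
```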
Now that the regression table is a gt_tbl object, we can use our knowledge of the gt package to modify the regression table.
For details on how to use the gt package, see its documentation. Here I will just give you an example of gt operations.
Example
list(
  "Corn 1" = model_1_corn,
  "Corn 2" = model_2_corn,
  "Soy 1" = model_1_soy,
  "Soy 2" = model_2_soy
) %>%
  modelsummary::msummary(
    output = "gt",
    gof_omit = "IC|Adj"
  ) %>%
  #--- add a spanner label over the two corn columns ---#
  gt::tab_spanner(
    label = "Corn",
    columns = c("Corn 1", "Corn 2")
  ) %>%
  #--- color the text in rows 7 and 8 red ---#
  gt::tab_style(
    style = gt::cell_text(color = "red"),
    locations = gt::cells_body(rows = 7:8)
  )
| Corn | | Soy 1 | Soy 2 |
---|---|---|---|---|
| Corn 1 | Corn 2 | | |
(Intercept) | 181.978 | 183.882 | 56.049 | 56.202 |
| (0.678) | (0.690) | (0.288) | (0.295) |
d1_5_9 | -0.216 | -0.367 | -0.062 | -0.069 |
| (0.135) | (0.133) | (0.055) | (0.055) |
d2_5_9 | -1.081 | -0.836 | -0.327 | -0.298 |
| (0.124) | (0.129) | (0.053) | (0.055) |
d3_5_9 | | -0.754 | | -0.173 |
| | (0.158) | | (0.090) |
d4_5_9 | | -2.194 | | -0.137 |
| | (0.320) | | (0.213) |
Num.Obs. | 1956 | 1956 | 1100 | 1100 |
R2 | 0.050 | 0.099 | 0.047 | 0.052 |
Log.Lik. | -8899.218 | -8847.985 | -3733.873 | -3731.078 |
F | 51.768 | 53.480 | 27.207 | 15.043 |
RMSE | 22.89 | 22.30 | 7.21 | 7.19 |
county_yield %>%
dplyr::filter(year %in% 2010:2012) %>%
modelsummary::datasummary(
(Year = factor(year)) * (
(`Corn Yield (bu/acre)` = corn_yield) +
(`Soy Yield (bu/acre)` = soy_yield) +
(`DI: category 4` = d4_5_9)
) ~
state_name * (Mean + SD) ,
data = .
)
Colorado | Kansas | Nebraska | |||||
---|---|---|---|---|---|---|---|
Year | Mean | SD | Mean | SD | Mean | SD | |
2010 | Corn Yield (bu/acre) | 196.08 | 12.96 | 182.38 | 17.12 | 182.37 | 14.80 |
Soy Yield (bu/acre) | 58.79 | 4.30 | |||||
DI: category 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
2011 | Corn Yield (bu/acre) | 186.25 | 12.76 | 160.56 | 29.69 | 178.32 | 16.00 |
Soy Yield (bu/acre) | 60.35 | 5.39 | |||||
DI: category 4 | 0.00 | 0.00 | 1.52 | 3.33 | 0.00 | 0.00 | |
2012 | Corn Yield (bu/acre) | 160.50 | 31.69 | 161.33 | 17.44 | 185.91 | 18.44 |
Soy Yield (bu/acre) | 59.80 | 5.21 | |||||
DI: category 4 | 1.79 | 1.60 | 6.16 | 3.59 | 3.05 | 2.65 |
modelsummary::datasummary()
Syntax:
formula has two sides separated by ~, just like a formula for regression.
Variables/statistics on the left-hand side comprise the rows, and those on the right-hand side comprise the columns.
Example
| Mean |
---|---|
corn_yield | 178.25 |
Switching the order changes the structure of the resulting table:
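A minimal sketch of the two formula orientations, on a small hypothetical data frame toy (made-up values, not county_yield). output = "data.frame" is used only so that the row and column layout can be inspected.

```r
library(modelsummary)

# Toy stand-in for county_yield (made-up values)
toy <- data.frame(corn_yield = c(150, 170, 190, 210))

# Variable on the left (row), statistic on the right (column)
tab_rows <- modelsummary::datasummary(
  corn_yield ~ Mean,
  data = toy,
  output = "data.frame"
)

# Switched: statistic on the left (row), variable on the right (column)
tab_cols <- modelsummary::datasummary(
  Mean ~ corn_yield,
  data = toy,
  output = "data.frame"
)

tab_rows
tab_cols
```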
The modelsummary package offers multiple summary functions of its own:
Mean
SD
Min
Max
P0, P25, P50, P75, P100
Histogram
These functions have na.rm = TRUE built in, so they do not return NA when a variable has missing values, unlike their base counterparts (e.g., mean() and sd()) called with default arguments.
For example, compare these two:
| Mean |
---|---|
corn_yield | 178.25 |
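The difference between Mean and the base mean can be sketched on a small hypothetical data frame with one missing value (toy is made up for illustration):

```r
library(modelsummary)

# Toy stand-in for county_yield with a missing soybean yield (made-up values)
toy <- data.frame(
  corn_yield = c(150, 170, 190, 210),
  soy_yield = c(40, NA, 50, 60)
)

# Mean drops the NA before averaging; base mean() does not
tab <- modelsummary::datasummary(
  soy_yield ~ Mean + mean,
  data = toy,
  output = "data.frame"
)
tab
```

Only the Mean column reports an average here, because mean() returns NA when the variable has a missing value.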
You can use a user-defined function that takes a vector of values and returns a single value.
Example:
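A minimal sketch with a hypothetical user-defined statistic, range_width(), applied to made-up data:

```r
library(modelsummary)

# Toy stand-in for county_yield (made-up values)
toy <- data.frame(corn_yield = c(150, 170, 190, 210))

# User-defined statistic: takes a vector and returns a single value
range_width <- function(x) max(x, na.rm = TRUE) - min(x, na.rm = TRUE)

tab <- modelsummary::datasummary(
  corn_yield ~ range_width + Mean,
  data = toy,
  output = "data.frame"
)
tab
```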
You can add more variables and statistics using +.
Example:
modelsummary::datasummary(
  corn_yield + soy_yield + d0_5_9 + d1_5_9
    ~ Mean + SD + MinMax + Histogram,
  data = county_yield
)
| Mean | SD | MinMax | Histogram |
---|---|---|---|---|
corn_yield | 178.25 | 23.50 | [0, 234.3] | ▁▄▇▆▁ |
soy_yield | 54.95 | 7.39 | [15, 74.3] | ▁▄▇▆▃▁ |
d0_5_9 | 3.92 | 3.94 | [0, 21.3569] | ▇▃▃▂▁ |
d1_5_9 | 3.15 | 4.15 | [0, 21.4838] | ▇▁▁▁▁ |
For each of the variables on the left-hand side, each of the statistics on the right-hand side is calculated and displayed.
You can use All() to create a summary table for all the numeric variables in the dataset.
At the moment, All() does not work on a tibble. So, if your dataset is a tibble, convert it to a data.frame on the fly, as in the code below:
Example:
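A minimal sketch of All() on a small hypothetical data.frame (toy, made-up values). If your data were a tibble, the same idea would be to wrap it as All(data.frame(your_tibble)).

```r
library(modelsummary)

# Toy stand-in for county_yield (made-up values)
toy <- data.frame(
  state_name = c("Kansas", "Kansas", "Nebraska", "Nebraska"),
  corn_yield = c(150, 170, 190, 210),
  soy_yield = c(40, 45, 50, 60)
)

# All() picks up the numeric variables only (state_name is skipped)
tab <- modelsummary::datasummary(
  All(toy) ~ Mean + SD,
  data = toy,
  output = "data.frame"
)
tab
```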
tablesummary()
You can nest categorical variables with *, meaning you can get summary statistics for each value of the categorical variable (like group_by() %>% summarize()).
Syntax
Examples:
modelsummary::datasummary(
  corn_yield + soy_yield + d0_5_9 + d1_5_9
    ~ state_name * (Mean + SD) + MinMax,
  data = county_yield
)
| Colorado | | Kansas | | Nebraska | | |
---|---|---|---|---|---|---|---|
| Mean | SD | Mean | SD | Mean | SD | MinMax |
corn_yield | 168.26 | 30.64 | 173.06 | 24.32 | 181.65 | 21.32 | [0, 234.3] |
soy_yield | | | 50.74 | 7.34 | 55.80 | 7.11 | [15, 74.3] |
d0_5_9 | 4.23 | 4.67 | 3.69 | 3.81 | 3.97 | 3.89 | [0, 21.3569] |
d1_5_9 | 2.66 | 3.52 | 2.96 | 4.19 | 3.28 | 4.20 | [0, 21.4838] |
For each value of state_name (Nebraska, Colorado, Kansas), Mean and SD are shown for each of the variables on the left-hand side. MinMax, however, is for the entire sample.
You can nest with multiple categorical variables by multiplying the statistics by each of them.
Example:
county_yield %>%
  dplyr::filter(year %in% 2011:2012) %>%
  dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
  modelsummary::datasummary(
    corn_yield + soy_yield + d0_5_9 + d1_5_9
      ~ factor(year) * state_name * (Mean + SD) + MinMax,
    data = .
  )
2011 | 2012 | ||||||||
---|---|---|---|---|---|---|---|---|---|
Kansas | Nebraska | Kansas | Nebraska | ||||||
Mean | SD | Mean | SD | Mean | SD | Mean | SD | MinMax | |
corn_yield | 160.56 | 29.69 | 178.32 | 16.00 | 161.33 | 17.44 | 185.91 | 18.44 | [100, 217] |
soy_yield | 60.35 | 5.39 | 59.80 | 5.21 | [48, 70.3] | ||||
d0_5_9 | 3.52 | 3.18 | 2.86 | 2.01 | 2.15 | 1.25 | 3.11 | 1.34 | [0, 8.7386] |
d1_5_9 | 5.05 | 3.28 | 0.01 | 0.05 | 2.62 | 1.17 | 2.74 | 1.39 | [0, 10.1494] |
For each of the unique combinations of state_name (Nebraska, Kansas) and year (2011, 2012), Mean and SD are shown for each of the variables on the left-hand side. MinMax, however, is for the entire sample.
By default, variable and statistic names are used as the labels in the table.
You can provide labels using the following syntax: (label = variable/stat)
Example:
county_yield %>%
  dplyr::filter(year %in% 2011:2012) %>%
  dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
  modelsummary::datasummary(
    (`Corn Yield (bu/acre)` = corn_yield)
      ~ state_name * (Mean + (Std.Dev. = SD)),
    data = .
  )
| Kansas | | Nebraska | |
---|---|---|---|---|
| Mean | Std.Dev. | Mean | Std.Dev. |
Corn Yield (bu/acre) | 160.99 | 23.31 | 181.95 | 17.56 |
corn_yield is labeled as Corn Yield (bu/acre)
SD is labeled as Std.Dev.
Note: when you have spaces in the label, surround the label with backticks.
If you do not like this way of changing labels, you can always use the gt package.
You can pass option arguments to the stats function by: stat * Arguments(options)
Example:
county_yield %>%
  dplyr::filter(year %in% 2011:2012) %>%
  dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
  modelsummary::datasummary(
    corn_yield
      ~ state_name * (mean + sd) * Arguments(na.rm = TRUE) +
      quantile * Arguments(probs = 0.1, na.rm = TRUE),
    data = .
  )
| Kansas | | Nebraska | | |
---|---|---|---|---|---|
| mean | sd | mean | sd | quantile |
corn_yield | 160.99 | 23.31 | 181.95 | 17.56 | 148.52 |
(mean + sd) * Arguments(na.rm = TRUE) adds the na.rm = TRUE option to mean() and sd()
quantile * Arguments(probs = 0.1, na.rm = TRUE) adds probs = 0.1 and na.rm = TRUE to quantile()
You can add a title and footnotes with the title and notes options.
Example
county_yield %>%
  dplyr::filter(year %in% 2011:2012) %>%
  dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
  modelsummary::datasummary(
    corn_yield
      ~ state_name * (mean + sd) * Arguments(na.rm = TRUE) +
      quantile * Arguments(probs = 0.1, na.rm = TRUE),
    data = .,
    title = "A title",
    notes = c("first note", "second note")
  )
| Kansas | | Nebraska | | |
---|---|---|---|---|---|
| mean | sd | mean | sd | quantile |
corn_yield | 160.99 | 23.31 | 181.95 | 17.56 | 148.52 |
first note | | | | | |
second note | | | | | |
You can use align to align columns. The available alignments are:
l: left
r: right
c: center
In align, you provide a sequence of the option letters (e.g., "lrc").
The nth letter corresponds to the nth column.
Example:
county_yield %>%
  dplyr::filter(year %in% 2011:2012) %>%
  dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
  modelsummary::datasummary(
    corn_yield
      ~ state_name * (`This is M E A N` = mean) * Arguments(na.rm = TRUE) +
      (`This is Q U A N T I L E` = quantile) * Arguments(probs = 0.1, na.rm = TRUE),
    data = .,
    align = "lrlc"
  )
| Kansas | Nebraska | |
---|---|---|---|
| This is M E A N | This is M E A N | This is Q U A N T I L E |
corn_yield | 160.99 | 181.95 | 148.52 |
You can use the output option to either export the table as a file or save it as an R object that you can further modify.
This works exactly the same way as in the modelsummary::msummary() function.
If your data were generated through randomized experiments (or you are using natural experiments), then datasummary_balance() can be very useful, as it generates a variable balance table.
Syntax:
variables to summarize: list of variables to summarize
treatment dummy: a dummy variable that indicates whether an observation is in the treated or control group
Example:
| Kansas (N=534) | | Nebraska (N=1268) | |
---|---|---|---|---|
| Mean | Std. Dev. | Mean | Std. Dev. |
corn_yield | 173.1 | 24.3 | 181.7 | 21.3 |
soy_yield | 50.7 | 7.3 | 55.8 | 7.1 |
d0_5_9 | 3.7 | 3.8 | 4.0 | 3.9 |
d1_5_9 | 3.0 | 4.2 | 3.3 | 4.2 |
d2_5_9 | 2.6 | 4.0 | 2.8 | 4.6 |
d3_5_9 | 1.6 | 3.4 | 1.5 | 3.5 |
d4_5_9 | 0.7 | 2.4 | 0.3 | 1.3 |
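A balance table like the one above can be sketched with a one-sided formula that names the grouping variable. The data frame toy is simulated, with state_name playing the role of the treatment indicator as in the table above. dinm = FALSE (an argument in recent modelsummary versions) is an assumption used here to skip the difference-in-means columns.

```r
library(modelsummary)

# Toy stand-in for county_yield (simulated values, not the lecture data)
set.seed(123)
toy <- data.frame(
  state_name = rep(c("Kansas", "Nebraska"), each = 50),
  corn_yield = rnorm(100, mean = 180, sd = 20),
  d1_5_9 = runif(100, 0, 10)
)

# All remaining variables are summarized separately for each group
tab <- modelsummary::datasummary_balance(
  ~ state_name,
  data = toy,
  dinm = FALSE,
  output = "data.frame"
)
tab
```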
You can create a correlation table with datasummary_correlation().
county_yield %>%
dplyr::filter(state_name %in% c("Nebraska", "Kansas")) %>%
dplyr::select(c(state_name, where(is.numeric))) %>%
dplyr::select(- year) %>%
modelsummary::datasummary_correlation()
| corn_yield | soy_yield | d0_5_9 | d1_5_9 | d2_5_9 | d3_5_9 | d4_5_9 |
---|---|---|---|---|---|---|---|
corn_yield | 1 | . | . | . | . | . | . |
soy_yield | .71 | 1 | . | . | . | . | . |
d0_5_9 | .13 | .04 | 1 | . | . | . | . |
d1_5_9 | -.13 | -.12 | .05 | 1 | . | . | . |
d2_5_9 | -.24 | -.21 | -.28 | .38 | 1 | . | . |
d3_5_9 | -.20 | -.12 | -.30 | -.02 | .29 | 1 | . |
d4_5_9 | -.22 | -.04 | -.18 | -.04 | .02 | .34 | 1 |