modelsummary
Click on the three horizontally stacked lines at the bottom-left corner of the slide to see the table of contents and jump to the section you want.
Press "o" on your keyboard to get a panel view of all the slides.
modelsummary package
We use county_yield throughout this lecture.
First install the r.spatial.workshop.datasets package.
Then, get the data:
# A tibble: 1,956 × 10
corn_yield soy_yield year county_code state_name d0_5_9 d1_5_9 d2_5_9 d3_5_9
<dbl> <dbl> <int> <chr> <chr> <dbl> <dbl> <dbl> <dbl>
1 123 42 2000 053 Kansas 2.49 2.87 0.134 0
2 188. NA 2017 095 Kansas 8.72 0 0 0
3 169. 58.4 2016 095 Kansas 1 0 0 0
4 198. NA 2015 095 Kansas 1.76 1.21 2.09 0
5 152. NA 2012 095 Kansas 6.28 1.47 9.54 4.46
6 170 42 2007 095 Kansas 0 0 0 0
7 193 49 2005 095 Kansas 4.32 0 0 0
8 173 47 2003 095 Kansas 2.29 5.16 4.46 1.09
9 165 40 2002 095 Kansas 3.71 1.48 1.90 0
10 171 52 2001 095 Kansas 9.88 0.188 0 0
# ℹ 1,946 more rows
# ℹ 1 more variable: d4_5_9 <dbl>
Variable Definitions
- soy_yield: soybean yield (bu/acre)
- corn_yield: corn yield (bu/acre)
- d0_5_9: ratio of weeks under drought severity of 0 from May to September
- d1_5_9: ratio of weeks under drought severity of 1 from May to September
- d2_5_9: ratio of weeks under drought severity of 2 from May to September
- d3_5_9: ratio of weeks under drought severity of 3 from May to September
- d4_5_9: ratio of weeks under drought severity of 4 from May to September
Let's first run the regressions that we are going to report in tables.
model_1_corn <- lm(corn_yield ~ d1_5_9 + d2_5_9, data = county_yield)
model_2_corn <- lm(corn_yield ~ d1_5_9 + d2_5_9 + d3_5_9 + d4_5_9, data = county_yield)
model_1_soy <- lm(soy_yield ~ d1_5_9 + d2_5_9, data = county_yield)
model_2_soy <- lm(soy_yield ~ d1_5_9 + d2_5_9 + d3_5_9 + d4_5_9, data = county_yield)
Get White-Huber robust variance-covariance matrix for the regressions:
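The code for this step is not shown here, so the following is only a minimal sketch of one common way to get such a matrix, using sandwich::vcovHC(). The built-in mtcars data stands in for county_yield, and type = "HC1" is an assumption, since the lecture does not say which variant it uses. Matrices like vcov_2_corn and vcov_2_soy used later could be built the same way from model_2_corn and model_2_soy.

```r
library(sandwich)

# Stand-in regression on a dataset that ships with R (an assumption for
# illustration; replace with model_2_corn, model_2_soy, and so on).
model <- lm(mpg ~ hp + wt, data = mtcars)

# Heteroskedasticity-robust (White-Huber) variance-covariance matrix.
# type = "HC1" is an assumed choice of small-sample correction.
vcov_robust <- sandwich::vcovHC(model, type = "HC1")

# Robust standard errors are the square roots of the diagonal elements.
sqrt(diag(vcov_robust))
```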
You can supply a list of regression results to modelsummary::msummary() to create a default regression table.
| | (1) | (2) | (3) | (4) |
|---|---|---|---|---|
| (Intercept) | 181.978 | 183.882 | 56.049 | 56.202 |
| | (0.678) | (0.690) | (0.288) | (0.295) |
| d1_5_9 | -0.216 | -0.367 | -0.062 | -0.069 |
| | (0.135) | (0.133) | (0.055) | (0.055) |
| d2_5_9 | -1.081 | -0.836 | -0.327 | -0.298 |
| | (0.124) | (0.129) | (0.053) | (0.055) |
| d3_5_9 | | -0.754 | | -0.173 |
| | | (0.158) | | (0.090) |
| d4_5_9 | | -2.194 | | -0.137 |
| | | (0.320) | | (0.213) |
| Num.Obs. | 1956 | 1956 | 1100 | 1100 |
| R2 | 0.050 | 0.099 | 0.047 | 0.052 |
| R2 Adj. | 0.049 | 0.097 | 0.046 | 0.049 |
| AIC | 17806.4 | 17708.0 | 7475.7 | 7474.2 |
| BIC | 17828.8 | 17741.4 | 7495.8 | 7504.2 |
| Log.Lik. | -8899.218 | -8847.985 | -3733.873 | -3731.078 |
| F | 51.768 | 53.480 | 27.207 | 15.043 |
| RMSE | 22.89 | 22.30 | 7.21 | 7.19 |
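A call that produces a table of this shape can be sketched as follows. The models here are fit on the built-in mtcars data purely as a stand-in (an assumption for illustration); with the lecture's objects you would pass list(model_1_corn, model_2_corn, model_1_soy, model_2_soy) instead.

```r
library(modelsummary)

# Stand-in models (assumption: mtcars replaces county_yield).
models <- list(
  lm(mpg ~ hp, data = mtcars),
  lm(mpg ~ hp + wt, data = mtcars)
)

# An unnamed list gives one numbered column per model.
modelsummary::msummary(models)
```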
modelsummary::msummary() offers multiple options to modify the default regression table to your liking:
- title: put a title to the table
- stars: place significance symbols (and modify the symbol placement rules)
- coef_map: change the order and label of variable names
- notes: add footnotes
- fmt: change the format of numbers
- statistic: type of statistics you display along with coefficient estimates
- gof_map: define which model statistics to display
- gof_omit: define which model statistics to omit from the default selection of model statistics
- add_rows: add rows of arbitrary contents to the table
Add stars = TRUE in modelsummary::msummary() to add significance markers.
You can modify significance levels and markers by supplying a named vector with its elements being the significance levels and their corresponding names being the significance markers.
Example:
| | (1) |
|---|---|
| (Intercept) | 181.978+*+ |
| | (0.678) |
| d1_5_9 | -0.216 |
| | (0.135) |
| d2_5_9 | -1.081+*+ |
| | (0.124) |
| Num.Obs. | 1956 |
| R2 | 0.050 |
| R2 Adj. | 0.049 |
| AIC | 17806.4 |
| BIC | 17828.8 |
| Log.Lik. | -8899.218 |
| F | 51.768 |
| RMSE | 22.89 |
Note: + p < 0.1, &+ p < 0.05, +*+ p < 0.01
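As a minimal sketch of the two ways of using stars, the block below first uses the default markers and then supplies a named vector. The model is a stand-in fit on mtcars, and the asterisk markers are an arbitrary choice for illustration, not the markers used in the table above.

```r
library(modelsummary)

# Stand-in model (assumption: mtcars replaces county_yield).
model <- lm(mpg ~ hp + wt, data = mtcars)

# Default significance markers.
modelsummary::msummary(list(model), stars = TRUE)

# Custom rule: names are the markers, elements are the significance levels.
star_rule <- c("*" = 0.1, "**" = 0.05, "***" = 0.01)
modelsummary::msummary(list(model), stars = star_rule)
```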
coef_map allows you to reorder coefficient rows and change their labels.
As with the stars option, you supply a named vector whose names are the existing labels and whose elements are the new labels.
In the table, the coefficient rows appear in the order in which they are listed in the named vector.
#--- define a coef_map vector ---#
coef_map_vec <- c(
"d1_5_9" = "DI: category 1",
"d2_5_9" = "DI: category 2",
"d3_5_9" = "DI: category 3",
"d4_5_9" = "DI: category 4",
"(Intercept)" = "Constant"
)
#--- create a table ---#
modelsummary::msummary(
list(model_2_corn, model_2_soy),
coef_map = coef_map_vec
)
| | (1) | (2) |
|---|---|---|
| DI: category 1 | -0.367 | -0.069 |
| | (0.133) | (0.055) |
| DI: category 2 | -0.836 | -0.298 |
| | (0.129) | (0.055) |
| DI: category 3 | -0.754 | -0.173 |
| | (0.158) | (0.090) |
| DI: category 4 | -2.194 | -0.137 |
| | (0.320) | (0.213) |
| Constant | 183.882 | 56.202 |
| | (0.690) | (0.295) |
| Num.Obs. | 1956 | 1100 |
| R2 | 0.099 | 0.052 |
| R2 Adj. | 0.097 | 0.049 |
| AIC | 17708.0 | 7474.2 |
| BIC | 17741.4 | 7504.2 |
| Log.Lik. | -8847.985 | -3731.078 |
| F | 53.480 | 15.043 |
| RMSE | 22.30 | 7.19 |
coef_omit lets you omit coefficient rows from the default selection.
You supply a string that is treated as a regular expression, and coefficient rows whose names match the pattern are omitted.
Example:
d2 matches with d2_5_9, and rows associated with the coefficients on d2_5_9 are removed.
| | (1) | (2) |
|---|---|---|
| (Intercept) | 183.882 | 56.202 |
| | (0.690) | (0.295) |
| d1_5_9 | -0.367 | -0.069 |
| | (0.133) | (0.055) |
| d3_5_9 | -0.754 | -0.173 |
| | (0.158) | (0.090) |
| d4_5_9 | -2.194 | -0.137 |
| | (0.320) | (0.213) |
| Num.Obs. | 1956 | 1100 |
| R2 | 0.099 | 0.052 |
| R2 Adj. | 0.097 | 0.049 |
| AIC | 17708.0 | 7474.2 |
| BIC | 17741.4 | 7504.2 |
| Log.Lik. | -8847.985 | -3731.078 |
| F | 53.480 | 15.043 |
| RMSE | 22.30 | 7.19 |
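The pattern matching can be sketched on a stand-in model as below. The mtcars regression and the pattern "hp" are assumptions for illustration; with the lecture's models the equivalent would be coef_omit = "d2".

```r
library(modelsummary)

# Stand-in model (assumption: mtcars replaces county_yield).
model <- lm(mpg ~ hp + wt, data = mtcars)

# "hp" is a regular expression: every coefficient whose name matches it
# is dropped, together with its standard error row.
modelsummary::msummary(list(model), coef_omit = "hp")
```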
gof_omit lets you omit model statistics like \(R^2\) from the default selection.
You supply a string that is treated as a regular expression, and statistics whose names match the pattern are omitted.
Example
IC matches with AIC and BIC, and Adj matches with R2 Adj
| | (1) | (2) |
|---|---|---|
| (Intercept) | 183.882 | 56.202 |
| | (0.690) | (0.295) |
| d1_5_9 | -0.367 | -0.069 |
| | (0.133) | (0.055) |
| d2_5_9 | -0.836 | -0.298 |
| | (0.129) | (0.055) |
| d3_5_9 | -0.754 | -0.173 |
| | (0.158) | (0.090) |
| d4_5_9 | -2.194 | -0.137 |
| | (0.320) | (0.213) |
| Num.Obs. | 1956 | 1100 |
| R2 | 0.099 | 0.052 |
| Log.Lik. | -8847.985 | -3731.078 |
| F | 53.480 | 15.043 |
| RMSE | 22.30 | 7.19 |
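A minimal sketch of the same pattern, run on a stand-in mtcars model (an assumption for illustration), looks like this. The pattern "IC|Adj" is the one described above: "IC" matches AIC and BIC, and "Adj" matches R2 Adj.

```r
library(modelsummary)

# Stand-in model (assumption: mtcars replaces county_yield).
model <- lm(mpg ~ hp + wt, data = mtcars)

# The "|" in the regular expression means "or".
modelsummary::msummary(list(model), gof_omit = "IC|Adj")
```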
add_rows can be used to insert arbitrary rows into a table. Adding rows using add_rows is a two-step process:
1. Create a data.frame (or tibble) to insert.
#--- create a table (data.frame) to insert ---#
(
  rows <- data.frame(
    c1 = c("FE: County", "FE: Year"),
    c2 = c("Yes", "Yes"),
    c3 = c("No", "Now")
  )
)
c1 c2 c3
1 FE: County Yes No
2 FE: Year Yes Now
2. Tell where to insert the data.frame by attr(data.frame, "position") <- row number.
#--- tell where to insert ---#
attr(rows, "position") <- c(3, 4)
#--- create a table with rows inserted ---#
modelsummary::msummary(
  list(Model1 = model_2_corn, Model2 = model_2_soy),
  gof_omit = "IC|Adj",
  coef_omit = "d",
  add_rows = rows
)
| | Model1 | Model2 |
|---|---|---|
| (Intercept) | 183.882 | 56.202 |
| | (0.690) | (0.295) |
| FE: County | Yes | No |
| FE: Year | Yes | Now |
| Num.Obs. | 1956 | 1100 |
| R2 | 0.099 | 0.052 |
| Log.Lik. | -8847.985 | -3731.078 |
| F | 53.480 | 15.043 |
| RMSE | 22.30 | 7.19 |
It is often the case that we replace the default variance-covariance matrix with a robust one for valid statistical testing.
You can achieve this using the statistic_override option (called vcov in recent versions of modelsummary). You give a list of variance-covariance matrices in the order in which their corresponding regression results appear in the table.
Syntax:
Default:
modelsummary::msummary(
  list(Model1 = model_2_corn, Model2 = model_2_soy),
  gof_omit = "IC|R",
  coef_omit = "d3|d4"
  #--- no statistical override ---#
)
| | Model1 | Model2 |
|---|---|---|
| (Intercept) | 183.882 | 56.202 |
| | (0.690) | (0.295) |
| d1_5_9 | -0.367 | -0.069 |
| | (0.133) | (0.055) |
| d2_5_9 | -0.836 | -0.298 |
| | (0.129) | (0.055) |
| Num.Obs. | 1956 | 1100 |
| Log.Lik. | -8847.985 | -3731.078 |
| F | 53.480 | 15.043 |
VCOV swapped:
modelsummary::msummary(
  list(Model1 = model_2_corn, Model2 = model_2_soy),
  gof_omit = "IC|R",
  coef_omit = "d3|d4",
  statistic_override = list(vcov_2_corn, vcov_2_soy)
)
| | Model1 | Model2 |
|---|---|---|
| (Intercept) | 183.882 | 56.202 |
| | (0.690) | (0.295) |
| d1_5_9 | -0.367 | -0.069 |
| | (0.133) | (0.055) |
| d2_5_9 | -0.836 | -0.298 |
| | (0.129) | (0.055) |
| Num.Obs. | 1956 | 1100 |
| Log.Lik. | -8847.985 | -3731.078 |
| F | 53.480 | 15.043 |
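In recent releases of modelsummary, to the best of my knowledge, the same swap is requested through the vcov argument. The sketch below uses stand-in mtcars models and sandwich::vcovHC() with type = "HC1"; both are assumptions for illustration rather than the lecture's own setup.

```r
library(modelsummary)
library(sandwich)

# Stand-in models (assumption: mtcars replaces county_yield).
models <- list(
  Model1 = lm(mpg ~ hp + wt, data = mtcars),
  Model2 = lm(qsec ~ hp + wt, data = mtcars)
)

# One robust variance-covariance matrix per model, in the same order
# as the models appear in the table.
robust_vcovs <- unname(lapply(models, sandwich::vcovHC, type = "HC1"))

# Standard errors in the table are now computed from the supplied matrices.
modelsummary::msummary(models, vcov = robust_vcovs)
```

Comparing this table with the one from a call without vcov is a quick way to confirm that the standard errors actually changed.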
You can save the table to a file by providing a file name to the output option.
The supported file types are:
Example:
The docx option may be particularly useful for those who want to put finishing touches on the table manually in Word:
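Saving to a file can be sketched as below. The model is a stand-in fit on mtcars, and the file name and the temporary directory are hypothetical choices for illustration. A Markdown file is written here; changing the extension (for example to .docx, .html, or .tex) should switch the file type, possibly requiring additional packages to be installed.

```r
library(modelsummary)

# Stand-in model (assumption: mtcars replaces county_yield).
model <- lm(mpg ~ hp + wt, data = mtcars)

# Hypothetical output path; the file extension decides the file type.
out_file <- file.path(tempdir(), "regression_table.md")

# Passing a file name to output writes the table to disk instead of printing it.
modelsummary::msummary(list(model), output = out_file)

file.exists(out_file)
```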
Using the output option in modelsummary::msummary(), you can turn the regression table into R objects that are readily modifiable by the gt, kableExtra, and flextable packages.
Example: flextable
#--- create a regression table and turn it into a flextable ---#
reg_table_ft <- list(model_1_corn, model_1_soy) %>%
  modelsummary::msummary(output = "flextable")
#--- check the class ---#
class(reg_table_ft)
[1] "flextable"
Example: gt
The regression table created using modelsummary::msummary() with output = "flextable" is now a flextable object.
So, we can use our knowledge of the flextable package to modify the regression table further.
For details on how to use the flextable package, see the flextable lecture notes.
Here I will just give you an example of the use of flextable operations.
Example
| | Corn 1 | Corn 2 | Soy 1 | Soy 2 |
|---|---|---|---|---|
| (Intercept) | 181.978 | 183.882 | 56.049 | 56.202 |
| | (0.678) | (0.690) | (0.288) | (0.295) |
| d1_5_9 | -0.216 | -0.367 | -0.062 | -0.069 |
| | (0.135) | (0.133) | (0.055) | (0.055) |
| d2_5_9 | -1.081 | -0.836 | -0.327 | -0.298 |
| | (0.124) | (0.129) | (0.053) | (0.055) |
| d3_5_9 | | -0.754 | | -0.173 |
| | | (0.158) | | (0.090) |
| d4_5_9 | | -2.194 | | -0.137 |
| | | (0.320) | | (0.213) |
| Num.Obs. | 1956 | 1956 | 1100 | 1100 |
| R2 | 0.050 | 0.099 | 0.047 | 0.052 |
| Log.Lik. | -8899.218 | -8847.985 | -3733.873 | -3731.078 |
| F | 51.768 | 53.480 | 27.207 | 15.043 |
| RMSE | 22.89 | 22.30 | 7.21 | 7.19 |
Now that the regression table is a gt_tbl object, we can use our knowledge of the gt package to modify the regression table.
For details on how to use the gt package, see the gt documentation. Here I will just give you an example of the use of gt operations.
Example
list(
  "Corn 1" = model_1_corn,
  "Corn 2" = model_2_corn,
  "Soy 1" = model_1_soy,
  "Soy 2" = model_2_soy
) %>%
  modelsummary::msummary(
    output = "gt",
    gof_omit = "IC|Adj"
  ) %>%
  #--- group the two corn columns under a "Corn" spanner ---#
  gt::tab_spanner(
    label = "Corn",
    columns = c("Corn 1", "Corn 2")
  ) %>%
  #--- color the text in rows 7 and 8 red ---#
  gt::tab_style(
    style = gt::cell_text(color = "red"),
    locations = gt::cells_body(rows = 7:8)
  )
| | Corn | Corn | Soy 1 | Soy 2 |
|---|---|---|---|---|
| | Corn 1 | Corn 2 | | |
| (Intercept) | 181.978 | 183.882 | 56.049 | 56.202 |
| | (0.678) | (0.690) | (0.288) | (0.295) |
| d1_5_9 | -0.216 | -0.367 | -0.062 | -0.069 |
| | (0.135) | (0.133) | (0.055) | (0.055) |
| d2_5_9 | -1.081 | -0.836 | -0.327 | -0.298 |
| | (0.124) | (0.129) | (0.053) | (0.055) |
| d3_5_9 | | -0.754 | | -0.173 |
| | | (0.158) | | (0.090) |
| d4_5_9 | | -2.194 | | -0.137 |
| | | (0.320) | | (0.213) |
| Num.Obs. | 1956 | 1956 | 1100 | 1100 |
| R2 | 0.050 | 0.099 | 0.047 | 0.052 |
| Log.Lik. | -8899.218 | -8847.985 | -3733.873 | -3731.078 |
| F | 51.768 | 53.480 | 27.207 | 15.043 |
| RMSE | 22.89 | 22.30 | 7.21 | 7.19 |
county_yield %>%
dplyr::filter(year %in% 2010:2012) %>%
modelsummary::datasummary(
(Year = factor(year)) * (
(`Corn Yield (bu/acre)` = corn_yield) +
(`Soy Yield (bu/acre)` = soy_yield) +
(`DI: category 4` = d4_5_9)
) ~
state_name * (Mean + SD) ,
data = .
)
| Colorado | Kansas | Nebraska | |||||
|---|---|---|---|---|---|---|---|
| Year | Mean | SD | Mean | SD | Mean | SD | |
| 2010 | Corn Yield (bu/acre) | 196.08 | 12.96 | 182.38 | 17.12 | 182.37 | 14.80 |
| Soy Yield (bu/acre) | 58.79 | 4.30 | |||||
| DI: category 4 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | |
| 2011 | Corn Yield (bu/acre) | 186.25 | 12.76 | 160.56 | 29.69 | 178.32 | 16.00 |
| Soy Yield (bu/acre) | 60.35 | 5.39 | |||||
| DI: category 4 | 0.00 | 0.00 | 1.52 | 3.33 | 0.00 | 0.00 | |
| 2012 | Corn Yield (bu/acre) | 160.50 | 31.69 | 161.33 | 17.44 | 185.91 | 18.44 |
| Soy Yield (bu/acre) | 59.80 | 5.21 | |||||
| DI: category 4 | 1.79 | 1.60 | 6.16 | 3.59 | 3.05 | 2.65 | |
modelsummary::datasummary()
Syntax:
formula has two sides separated by ~ just like formula for regression.
Variables/statistics on the left-hand side (right-hand side) comprise rows (columns).
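The formula layout described above can be sketched as follows. The built-in mtcars data and its columns mpg and hp are stand-ins chosen for illustration, not the lecture's data.

```r
library(modelsummary)

# Variables on the left-hand side become rows,
# statistics on the right-hand side become columns.
modelsummary::datasummary(mpg + hp ~ Mean + SD, data = mtcars)

# Flipping the two sides flips the layout:
# statistics become rows and variables become columns.
modelsummary::datasummary(Mean + SD ~ mpg + hp, data = mtcars)
```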
The modelsummary package offers multiple summary functions of its own:
- Mean
- SD
- Min
- Max
- P0, P25, P50, P75, P100
- Histogram
These functions have na.rm = TRUE built in, so they do not return NA when the data contain missing values, unlike their counterparts from the base package.
You can also use a user-defined function that takes a vector of values and returns a single value.
Example:
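The original example is not shown here, so the following is only a sketch of what such a function could look like. The name MinMax matches the statistic used in the tables of this lecture, but the body, including rounding to four digits, is an assumption; mtcars is a stand-in dataset.

```r
library(modelsummary)

# User-defined statistic: takes a vector and returns a single value.
# Rounding to 4 digits is an assumed formatting choice.
MinMax <- function(x) {
  paste0(
    "[", round(min(x, na.rm = TRUE), 4),
    ", ", round(max(x, na.rm = TRUE), 4), "]"
  )
}

# Quick check on a small vector with a missing value.
MinMax(c(3, NA, 10.5))

# Use it like any built-in statistic on the right-hand side.
modelsummary::datasummary(mpg + hp ~ Mean + MinMax, data = mtcars)
```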
You can add more variables and statistics using +.
Example:
modelsummary::datasummary(
  corn_yield + soy_yield + d0_5_9 + d1_5_9
    ~ Mean + SD + MinMax + Histogram,
  data = county_yield
)
| | Mean | SD | MinMax | Histogram |
|---|---|---|---|---|
| corn_yield | 178.25 | 23.50 | [0, 234.3] | ▁▄▇▆▁ |
| soy_yield | 54.95 | 7.39 | [15, 74.3] | ▁▄▇▆▃▁ |
| d0_5_9 | 3.92 | 3.94 | [0, 21.3569] | ▇▃▃▂▁ |
| d1_5_9 | 3.15 | 4.15 | [0, 21.4838] | ▇▁▁▁▁ |
For each of the variables on the left-hand side, each of the statistics on the right-hand side is calculated and displayed.
You can use All() to create a summary table for all the numeric variables in the dataset.
At the moment, All() does not work on a tibble. So, if your dataset is a tibble, convert it to a data.frame on the fly in the code like below:
Example:
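A minimal sketch of this pattern is below. The built-in mtcars data is a stand-in; it is already a data.frame, so the data.frame() call changes nothing here and only shows where the conversion would go for a tibble such as county_yield.

```r
library(modelsummary)

# Stand-in data (assumption: mtcars replaces county_yield).
dat <- mtcars

# All() picks up every numeric variable; data.frame() does the
# on-the-fly conversion that would be needed for a tibble.
modelsummary::datasummary(All(data.frame(dat)) ~ Mean + SD, data = dat)
```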
tablesummary()
You can nest categorical variables with *, meaning you can get summary statistics for each value of the categorical variable (like group_by() %>% summarize()).
Syntax
Examples:
modelsummary::datasummary(
corn_yield + soy_yield + d0_5_9 + d1_5_9
~ state_name * (Mean + SD) + MinMax,
data = county_yield
)
| | Colorado | | Kansas | | Nebraska | | |
|---|---|---|---|---|---|---|---|
| | Mean | SD | Mean | SD | Mean | SD | MinMax |
| corn_yield | 168.26 | 30.64 | 173.06 | 24.32 | 181.65 | 21.32 | [0, 234.3] |
| soy_yield | | | 50.74 | 7.34 | 55.80 | 7.11 | [15, 74.3] |
| d0_5_9 | 4.23 | 4.67 | 3.69 | 3.81 | 3.97 | 3.89 | [0, 21.3569] |
| d1_5_9 | 2.66 | 3.52 | 2.96 | 4.19 | 3.28 | 4.20 | [0, 21.4838] |
For each value of state_name (Nebraska, Colorado, Kansas), Mean and SD are shown for each of the variables on the left-hand side. But, MinMax is for the entire sample.
You can nest with multiple categorical variables by multiplying stats with multiple categorical variables.
Example:
county_yield %>%
dplyr::filter(year %in% 2011:2012) %>%
dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
modelsummary::datasummary(
corn_yield + soy_yield + d0_5_9 + d1_5_9
~ factor(year) * state_name * (Mean + SD) + MinMax,
data = .
)
| 2011 | 2012 | ||||||||
|---|---|---|---|---|---|---|---|---|---|
| Kansas | Nebraska | Kansas | Nebraska | ||||||
| Mean | SD | Mean | SD | Mean | SD | Mean | SD | MinMax | |
| corn_yield | 160.56 | 29.69 | 178.32 | 16.00 | 161.33 | 17.44 | 185.91 | 18.44 | [100, 217] |
| soy_yield | 60.35 | 5.39 | 59.80 | 5.21 | [48, 70.3] | ||||
| d0_5_9 | 3.52 | 3.18 | 2.86 | 2.01 | 2.15 | 1.25 | 3.11 | 1.34 | [0, 8.7386] |
| d1_5_9 | 5.05 | 3.28 | 0.01 | 0.05 | 2.62 | 1.17 | 2.74 | 1.39 | [0, 10.1494] |
For each of the unique combinations of state_name (Nebraska, Kansas) and year (2011, 2012), Mean and SD are shown for each of the variables on the left-hand side. But, MinMax is for the entire sample.
By default variable and statistics names are used as the labels in the table.
You can provide labels by the following syntax: (label = variable/stat)
Example:
county_yield %>%
dplyr::filter(year %in% 2011:2012) %>%
dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
modelsummary::datasummary(
(`Corn Yield (bu/acre)` = corn_yield)
~ state_name * (Mean + (Std.Dev. = SD)),
data = .
)
| | Kansas | | Nebraska | |
|---|---|---|---|---|
| | Mean | Std.Dev. | Mean | Std.Dev. |
| Corn Yield (bu/acre) | 160.99 | 23.31 | 181.95 | 17.56 |
- corn_yield is labeled as Corn Yield (bu/acre)
- SD is labeled as Std.Dev.
Note: when you have spaces in the label, surround the label with backticks.
If you do not like this way of changing labels, you can always use the gt package.
You can pass optional arguments to a stats function with: stat * Arguments(options)
Example:
county_yield %>%
dplyr::filter(year %in% 2011:2012) %>%
dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
modelsummary::datasummary(
corn_yield
~ state_name * (mean + sd) * Arguments(na.rm = TRUE) +
quantile * Arguments(prob = 0.1, na.rm = TRUE),
data = .
)
| | Kansas | | Nebraska | | |
|---|---|---|---|---|---|
| | mean | sd | mean | sd | quantile |
| corn_yield | 160.99 | 23.31 | 181.95 | 17.56 | 148.52 |
- (mean + sd) * Arguments(na.rm = TRUE) adds the na.rm = TRUE option to mean() and sd()
- quantile * Arguments(prob = 0.1, na.rm = TRUE) adds prob = 0.1 and na.rm = TRUE to quantile()
Example
county_yield %>%
dplyr::filter(year %in% 2011:2012) %>%
dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
modelsummary::datasummary(
corn_yield
~ state_name * (mean + sd) * Arguments(na.rm = TRUE) +
quantile * Arguments(prob = 0.1, na.rm = TRUE),
data = .,
title = "A title",
notes = c("first note", "second note")
)
| | Kansas | | Nebraska | | |
|---|---|---|---|---|---|
| | mean | sd | mean | sd | quantile |
| corn_yield | 160.99 | 23.31 | 181.95 | 17.56 | 148.52 |
first note
second note
You can use align to align columns. Available alignments are:
- l: left
- r: right
- c: center
For the align option, you provide a string made of these letters (e.g., "lrcl").
The nth letter corresponds to the nth column.
Example:
county_yield %>%
dplyr::filter(year %in% 2011:2012) %>%
dplyr::filter(state_name %in% c("Kansas", "Nebraska")) %>%
modelsummary::datasummary(
corn_yield
~ state_name * (`This is M E A N` = mean) * Arguments(na.rm = TRUE) +
(`This is Q U A N T I L E` = quantile) * Arguments(prob = 0.1, na.rm = TRUE),
data = .,
align = "lrlc"
)
| | Kansas | Nebraska | |
|---|---|---|---|
| | This is M E A N | This is M E A N | This is Q U A N T I L E |
| corn_yield | 160.99 | 181.95 | 148.52 |
You can use the output option to either export the table as a file or save it as R objects which you can further modify.
This works exactly the same way as the modelsummary::msummary() function.
If your data was generated through randomized experiments (or you are using natural experiments), then datasummary_balance() can be very useful as it can generate a variable balance table.
Syntax:
- variables to summarize: list of variables to summarize
- treatment dummy: a dummy variable that indicates whether an observation is in the treated or control group
Example:
| | Kansas (N=534) | | Nebraska (N=1268) | |
|---|---|---|---|---|
| | Mean | Std. Dev. | Mean | Std. Dev. |
| corn_yield | 173.1 | 24.3 | 181.7 | 21.3 |
| soy_yield | 50.7 | 7.3 | 55.8 | 7.1 |
| d0_5_9 | 3.7 | 3.8 | 4.0 | 3.9 |
| d1_5_9 | 3.0 | 4.2 | 3.3 | 4.2 |
| d2_5_9 | 2.6 | 4.0 | 2.8 | 4.6 |
| d3_5_9 | 1.6 | 3.4 | 1.5 | 3.5 |
| d4_5_9 | 0.7 | 2.4 | 0.3 | 1.3 |
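A balance table of this kind can be sketched as below, assuming a one-sided formula with the group variable on the right-hand side and every other column in the data treated as a variable to summarize. The mtcars data and its am column (0/1) are stand-ins for the lecture's data and treatment dummy, and dinm = FALSE is an assumed setting that skips the difference-in-means columns.

```r
library(modelsummary)

# Keep only the variables to summarize plus the group indicator
# (assumption: am plays the role of the treatment dummy).
dat <- mtcars[, c("mpg", "hp", "wt", "am")]

# Mean and standard deviation of each variable by group.
modelsummary::datasummary_balance(~ am, data = dat, dinm = FALSE)
```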
You can create a correlation table with datasummary_correlation().
county_yield %>%
dplyr::filter(state_name %in% c("Nebraska", "Kansas")) %>%
dplyr::select(c(state_name, where(is.numeric))) %>%
dplyr::select(- year) %>%
modelsummary::datasummary_correlation()
| | corn_yield | soy_yield | d0_5_9 | d1_5_9 | d2_5_9 | d3_5_9 | d4_5_9 |
|---|---|---|---|---|---|---|---|
| corn_yield | 1 | . | . | . | . | . | . |
| soy_yield | .71 | 1 | . | . | . | . | . |
| d0_5_9 | .13 | .04 | 1 | . | . | . | . |
| d1_5_9 | -.13 | -.12 | .05 | 1 | . | . | . |
| d2_5_9 | -.24 | -.21 | -.28 | .38 | 1 | . | . |
| d3_5_9 | -.20 | -.12 | -.30 | -.02 | .29 | 1 | . |
| d4_5_9 | -.22 | -.04 | -.18 | -.04 | .02 | .34 | 1 |