6 Rmarkdown-WORD
6.1 Before you start
Before delving into this chapter, carefully consider the time and effort required to master these skills. Ensure the investment aligns with your goals and priorities.
6.2 Preparation
Before diving in, please do the following:
Go here and download all the files including sample_to_word.rmd, which we refer to as the sample rmd file throughout this chapter.
Alternatively, you can clone this GitHub repository and go to the Resources/WORD-Rmarkdown/ folder.
Open RStudio (or any other software you may be using, like VS Code) and knit sample_to_word.rmd to produce sample_to_word.docx, which we refer to as the sample WORD file.
Install the following packages if you have not already:
officedown
flextable
officer
knitr
rmarkdown
tidyverse
modelsummary
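A minimal sketch of one way to install only the packages from the list above that are not yet on your machine. The variable names (pkgs, missing_pkgs) are made up for this illustration, and the CRAN mirror URL is an assumption you may want to change.

```r
# Packages used in this chapter (taken from the list above)
pkgs <- c(
  "officedown", "flextable", "officer", "knitr",
  "rmarkdown", "tidyverse", "modelsummary"
)

# Keep only the packages that are not installed yet
missing_pkgs <- setdiff(pkgs, rownames(installed.packages()))

# Install the missing ones (does nothing if everything is already there)
if (length(missing_pkgs) > 0) {
  install.packages(missing_pkgs, repos = "https://cloud.r-project.org")
}
```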
6.3 Basic workflow until publication
Here are the proposed steps to generate a fully publication-ready article:
1. Write an Rmarkdown file and knit to a WORD file without worrying about formatting your manuscript. (Most journals do not require manuscript formatting until the manuscript is accepted, but please check for yourself whether it is required.)
2. Iterate between the authors until ready for submission
3. Submit your manuscript (no formatting just yet)
4. Revise and resubmit your manuscript after iterating between the authors (the same as step 2)
5. Repeat steps 2, 3, and 4 until accepted (terminate the process if rejected)
6. Finally, format your manuscript to the specific requirements of the journal
   - create a style reference WORD file and refer to it (see Section 6.6).
   - modify the format manually in WORD if that is faster
Now, let’s take a closer look at Step 2. What you do in this step differs based on who your co-authors are.
6.4 Setting up an Rmarkdown file for WORD output
An Rmarkdown file starts with a YAML header, which lets you specify things like
- paper title
- authors (with affiliations and other subsidiary information)
- date
- abstract
It also lets you specify various aspects of the output WORD file including
- whether to include a table of contents (toc)
- the depth of the table of contents (toc_depth)
- whether to number sections or not (number_sections)
- how to display plots and tables (plots, tables)
  - alignment (align)
  - prefix (pre), etc
This is also where you specify which files you use for the bibliography and the citation style, among other things. Important ones will be introduced later.
Here is an example YAML header, which you can see in the sample_to_word.rmd file.
Note that we are using officedown::rdocx_document from the officedown package, with the base format being word_document2 from the bookdown package, as a way to compile to WORD. Other alternatives include word_document from the rmarkdown package or word_document2 from the bookdown package used on its own. After some experimentation with these three options, I found that this option is the most complete. See the officedown chapter of the officeverse book, written by the author of the officedown package.
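For orientation, here is a sketch of what a YAML header using these options could look like. It is not copied from sample_to_word.rmd: the title, author, date, abstract, and all numeric values are placeholders chosen for illustration, and the exact set of options accepted may depend on your officedown version.

```yaml
---
title: "Title of the paper"          # placeholder
author: "Author name"                # placeholder
date: "YYYY-MM-DD"                   # placeholder
abstract: "One-paragraph abstract."  # placeholder
output:
  officedown::rdocx_document:
    base_format: bookdown::word_document2
    toc: false
    toc_depth: 2
    number_sections: true
    plots:
      align: center
      topcaption: false
      caption:
        pre: "Figure "
        sep: ": "
    tables:
      topcaption: true
      caption:
        pre: "Table "
        sep: ": "
    page_size:
      width: 8.5
      height: 11
      orient: "portrait"
    page_margins:
      bottom: 1
      top: 1
      right: 1
      left: 1
      header: 0.5
      footer: 0.5
      gutter: 0
---
```

Options such as toc, toc_depth, and number_sections are not handled by officedown itself; they are passed on to the base format.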
6.5 Essential markdown syntaxes for WORD
Here is the essential markdown syntax you should know. For other syntax, such as creating a list, please refer to Section 5.6.
Section, subsection, subsubsections, …
You can define sections, subsections, and subsubsections by using #, ##, and ### respectively at the beginning of a line.
# This becomes section title
## This becomes subsection title
### This becomes subsubsection title
Footnote
You can add a footnote using ^[] like this:
regular texts^[this is a footnote]
Footnotes are automatically numbered.
Page break
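To introduce a page break, you can place the following inline R code on its own line (run_pagebreak() comes from the officer package, so that package needs to be loaded).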
`r run_pagebreak()`
6.6 Specifying the style
You can change the style of the output WORD file either via direct control in the YAML header or via a reference WORD file.
6.6.1 Change styles in the YAML header
You can control the style of the output WORD file to some extent in the YAML header. For example, you can determine the page size and margins using page_size and page_margins (see the YAML header in the sample rmd file). You also have some control over how tables and figures appear. For example, the align option lets you determine the horizontal position of tables and figures, and topcaption lets you determine whether the caption goes at the top or the bottom.
6.6.2 Change styles through reference docx
It is almost always the case that the above approach is not sufficient to format your WORD file to the specific requirements of your target journal. In such a case, you can define the style of the output WORD file in detail using a style reference WORD file. This method can control virtually every aspect of WORD file styling. Commonly customized elements include
- font size
- font family
- line spacing
To do this, first create a style reference WORD file that follows the style and format you would like the output WORD file to have. Then, add reference_docx to the YAML header under your output format, like below.
output:
  officedown::rdocx_document:
    reference_docx: word-style.docx
In our example project, word_template.docx is the reference file.
Of course, only the style and format of the reference WORD file are inherited by the output WORD file, not its contents.
You can change the style of the reference WORD style file and save the changes. Then, the style changes will be reflected in the output WORD file when the rmd file is knitted next time.
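If you do not have a reference file yet, one possible starting point is sketched below. It assumes the officer package is installed, uses officer's built-in default template rather than the chapter's word_template.docx, and writes to a made-up file name (word-style-starter.docx). Listing the style names first can help you find the style to modify in WORD.

```r
library(officer)

# Start from officer's built-in default template
# (an assumption for this sketch, not the chapter's word_template.docx)
doc <- read_docx()

# List the paragraph styles defined in the template
styles <- styles_info(doc)
paragraph_styles <- styles[
  styles$style_type == "paragraph",
  c("style_id", "style_name")
]
print(paragraph_styles)

# Save a copy that you can open in WORD and restyle
# (hypothetical file name)
print(doc, target = "word-style-starter.docx")
```

After editing the styles in WORD, you would point reference_docx at the saved file.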
6.6.3 Style change example
This section gives you a quick look at what changing the style of a reference WORD file looks like. We use word_template.docx for this demonstration. First, open the document in WORD and open the Styles Pane (if you are using a Mac, it should be at the upper right corner of the document). Then you should see something like below.
Then, put your cursor anywhere in line 2 as below, and you should see that the current style changes to “1 Heading 1”, which is the name of the class (style type) that the text in line 2 belongs to.
Now, put your cursor at the black triangle and you should see available options including “Modify Style…”
Once you click on it, you should see a pop-up window like below.
Here, you can change font family, font size, among other things. In this demonstration, let’s get rid of automatic section number. To do so, click on the “Format” button at the lower left corner of the window. Then, you should see this.
Pick “Numbering,” click on “None,” and hit “OK.”
Hit “OK” again in the previous pop-up window, then you should see this.
Notice that section numbering is now gone. Now, knit the sample rmd file and confirm that the output WORD file indeed lost section numbers (subsection numbers still remain because you did not modify that part in the reference docx.).
6.7 Citations and References
6.7.1 Set up
First, create a bibliography file. Then, add the following to the YAML header (not under output:).
bibliography: bibliography file name
Various bibliography formats can be used, including BibLaTeX/BibTeX (.bib), CSL-JSON (.json), and EndNote (.enl), among others.
Then, add the following to the part of the rmd file where you want to put the references.
::: {#refs}
:::
In our example, we use a bib file and the bibliography file is named bibliography.bib and specified in the YAML header as below
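Based on the file name given above, the relevant line of the YAML header would look roughly like this (a sketch showing only this one field, placed at the top level of the header rather than under output:):

```yaml
bibliography: bibliography.bib
```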
6.7.2 Cite and create references
To cite, use the following syntax:
- @reference_name to print “author names (year)” in the output WORD file
- [@reference_name] to print “(author names, year)” in the output WORD file
- [@reference_name_1; @reference_name_2] to print “(author names, year; author names, year)” in the output WORD file
- [-@reference_name] to print just the year
Here, reference_name is the citation key of an entry. In a .bib file, it is the very first item of the entry, right after the opening brace, as in @article{reference_name, ...}. If you are using a CSL json file, it is the id of the entry.
The cited items are automatically added to the reference list following the specified style (see the next section).
6.7.3 Citation and Reference Style
You can change the citation and reference style using the Citation Style Language (CSL). Citation style files have the .csl extension.
- Obtain the csl file you would like to use from the Zotero citation style repository.
- Place the following in the YAML header (not under output:):
csl: csl file name
- Then, when knitted, the citation and reference styles reflect the style specified by the csl file
Currently, the csl style should be set to qje.csl (citation style language for The Quarterly Journal of Economics) as below
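A sketch of how these lines could look in the YAML header, assuming both qje.csl and pnas.csl sit next to the rmd file. Only one csl line should be active at a time; the other is commented out with #.

```yaml
csl: qje.csl
# csl: pnas.csl
```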
Citation and reference styles in the output WORD file follow the rules for the QJE.
Now, comment out csl: qje.csl and uncomment csl: pnas.csl so that the CSL for the Proceedings of the National Academy of Sciences (PNAS) is used, and then knit the sample rmd file.
You can now see that the citation style no longer respects the rules I mentioned above and also the reference style follows that of PNAS. This is because PNAS uses only numbers, but not author names or years.
6.8 Tables (Cross-referenced)
- Create a table using the flextable package (this is not the only option; alternatives are discussed later)
- Add an R code chunk like this:
```{r, tab.id = "table-id", tab.cap = "table-name"}
table_ft
```
  - table_ft is a flextable object.
  - table-name is the caption of the table in the output WORD file
  - table-id is the table id you can use to cross-reference
- Use \@ref(tab:table-id) in the Rmarkdown file to cross-reference the table (table numbering in the output WORD file is automatic)
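As a minimal sketch of the first step, the code below builds a flextable object named table_ft from a few rows and columns of the built-in mtcars data. The choice of data and columns is arbitrary and only for illustration.

```r
library(flextable)

# Build a flextable object from the first rows of a built-in dataset
table_ft <- flextable(head(mtcars[, c("mpg", "hp", "cyl", "wt")]))

# Adjust column widths to fit the contents
table_ft <- autofit(table_ft)
```

Placing table_ft in a chunk with the tab.id and tab.cap options then gives a captioned table that can be cross-referenced.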
6.8.1 Packages to create tables
One of the disadvantages of writing to a WORD file is that some table-making R packages are not compatible with it. The flextable package is written by the author of the officedown package, which we are using to write to a WORD file. So, naturally, a table object created by the flextable package (a flextable object) can be knitted into the output WORD file with (almost) no hiccups, as we saw earlier. A notable R package that does NOT work well with the officedown::rdocx_document output format (which we are using) is the gt package. Tables created by this package are gt objects. Unfortunately, there is no function that lets you convert a gt object to a flextable object as of now.
One of the recommended packages is the modelsummary package (especially for those who often include regression results tables and summary statistics tables). It lets you create regression results tables via the modelsummary() function and summary statistics tables via the datasummary() function. Both functions have an option called output, and you can use output = "flextable" to generate tables as flextable objects, which can then be included in the output file easily.
Here is some example R code that uses the modelsummary package to create a regression results table and a summary statistics table.
Regression table
library(tidyverse) # for %>%
library(flextable) # for autofit() and hline()

#--- regressions ---#
lm_1 <- fixest::feols(mpg ~ hp + cyl, data = mtcars)
lm_2 <- fixest::feols(mpg ~ hp + cyl + wt, data = mtcars)
lm_3 <- fixest::feols(mpg ~ hp + cyl + wt, cluster = ~ vs, data = mtcars)

#--- create a regression results table ---#
modelsummary::modelsummary(
  list(lm_1, lm_2, lm_3),
  output = "flextable",
  gof_omit = "IC|Log|Adj|F|Pseudo|Within"
) %>%
  autofit() %>%
  # Add a horizontal line. modelsummary() adds a horizontal line separating
  # coefficient estimates and model summary statistics, but it disappears
  # on WORD for some reason. So, it is manually added here.
  hline(8)
Summary statistics table
modelsummary::datasummary(
  mpg + hp + cyl ~ Mean + SD,
  data = mtcars,
  output = "flextable"
)
Here is one of the resources to learn how to use the flextable package if you need to further modify the tables created by the modelsummary() and datasummary() functions.
6.9 Figures (Cross-referenced)
6.9.1 Figures created internally
You can create plots within an Rmarkdown file and display them in the output WORD file. Here are the steps.
- Create a plot using R
- Add an R code chunk like this:
```{r, fig.id = "figure-id", fig.cap = "caption"}
figure_g
```
  - figure_g is a plot.
  - fig.cap = "caption" adds caption as the caption of the figure
  - figure-id is the figure id used for cross-referencing
- Use \@ref(fig:figure-id) in the Rmarkdown file to cross-reference the figure (figure numbering in the output WORD file is automatic)
You can control the size of the plots in the output WORD file using the fig.width and fig.height options in the R code chunk. For example, fig.width = 4 means that the plot will be 4 inches wide. Use the dpi option to control the resolution of the plot. The higher the dpi value, the sharper the plot.
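As an illustration of the plot object used in the steps above, here is a minimal sketch that creates figure_g with ggplot2 (part of the tidyverse). The dataset and variables are arbitrary choices for this example.

```r
library(ggplot2)

# A simple scatter plot stored as an object named figure_g
figure_g <- ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point()
```

Printing figure_g inside a chunk that sets fig.id, fig.cap, and, if needed, fig.width, fig.height, and dpi places the plot in the output WORD file.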
6.9.2 Importing pre-made figures
Instead of creating plots using R code inside an Rmarkdown file, we often need to import figures that were made elsewhere. You may be importing your company or university logo. You may have created plots using the ggplot2 package and saved them as pictures. To import a pre-made figure and cross-reference it, you can use knitr::include_graphics() as follows.
```{r echo = FALSE, fig.id = "figure-id", fig.cap = "figure caption"}
knitr::include_graphics("file name")
```
You can cross-reference imported figures in the same manner as the R-generated figures as shown above.
It is important to note that pdf files are not accepted. One of the accepted file types is .png (or .jpg). So, if you are creating figures outside of the Rmarkdown file, save them as png files.
To change the size of the imported figure, you can use the same fig.width and fig.height chunk options as for internally created plots. However, you cannot control the resolution of the imported figure using dpi (naturally), because the resolution of the saved image is respected. If you are using R to create a plot, you can set its dpi when saving it. For example, if you are using ggplot2, the dpi option in ggsave() will do the job.
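A short sketch of saving a plot as a png file with a chosen resolution, assuming ggplot2 is installed. The file name figure_to_import.png and the size and dpi values are made up for illustration.

```r
library(ggplot2)

# An example plot to save (arbitrary data and variables)
figure_g <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# Save as png: width and height are in inches by default,
# and dpi sets the resolution of the saved image
ggsave(
  "figure_to_import.png",
  plot = figure_g,
  width = 6,
  height = 4,
  dpi = 300
)
```

The saved file can then be passed to knitr::include_graphics() in place of "file name".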
6.10 Mathematical equations
You can use Latex-like math syntax to write mathematical equations. For equation numbering and cross-referencing to work as discussed here, make sure that you use bookdown::word_document2 for the base_format in the YAML header as below.
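The relevant part of the YAML header would look roughly like this (a sketch showing only the base_format setting; other options are omitted):

```yaml
output:
  officedown::rdocx_document:
    base_format: bookdown::word_document2
```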
6.10.1 Math equation
Math for WORD output is much more limited than when writing to a PDF file using Latex (when knitting to a PDF file, Rmarkdown uses Latex to render math equations). This is because Latex is NOT involved in converting Latex-like math syntax to math equations when knitting to WORD. Instead, pandoc does the conversion. So, it is not surprising that not all of the Latex math capabilities can be used when knitting to a WORD file. With that said, most of the syntax that you will need is the same between Rmd-to-WORD and Latex. You can use the equation (for a single line of math) and align (for one or more lines of equations) environments with successful cross-referencing.
equation environment
To use an equation environment, first write math and then put (\#eq:equation-id) at the end (but before \end{equation}) to give an equation id to the equation. You can use \@ref(eq:equation-id) to cross-reference the equation.
\begin{equation}
Math
(\#eq:equation-id)
\end{equation}
For example,
\begin{equation}
\bar{y} = \sum_{i=1}^n y_i
(\#eq:eq-1)
\end{equation}
should print like below in the output WORD file.
\[ \bar{y} = \sum_{i=1}^n y_i \]
align environment
This works just like the equation environment. To use an align environment, first write line(s) of math and then put (\#eq:equation-id) at the end (but before \end{align}) to give an equation id to the equation. You can use \@ref(eq:equation-id) to cross-reference the equation.
\begin{align}
Math \\
Math
(\#eq:equation-id)
\end{align}
For example,
\begin{align}
AR(p): Y_i &= c + \epsilon_i + \phi_i Y_{i-1} \dots \\
Y_{i} &= c + \phi_i Y_{i-1} \dots
\end{align}
should print like below in the output WORD file.
\[ \begin{aligned} AR(p): Y_i &= c + \epsilon_i + \phi_i Y_{i-1} \dots \\ Y_{i} &= c + \phi_i Y_{i-1} \dots \end{aligned} \]
6.10.2 In-line math
To write a mathematical expression in line, you can enclose math expressions by $ like below.
Our model is written as $Y_z = f_z(S) + g_z(N) + h_z(X,Y) + \varepsilon_z$.
This should appear like below in the output WORD file.
Our model is written as \(Y_z = f_z(S) + g_z(N) + h_z(X,Y) + \varepsilon_z\).