Preface

How is this book any different from other online books and resources?

We are seeing an explosion of online (and free) resources that teach how to use R for spatial data processing.1 Here is an incomplete list of such resources:

1 This phenomenon is largely thanks to packages like bookdown (Xie 2016), blogdown (Xie, Hill, and Thomas 2017), and pkgdown (Wickham and Hesselberth 2020), which have significantly lowered the cost of creating professional content. Indeed, this book was built with the bookdown package.

Thanks to all these resources, it has become much easier to self-teach R for GIS work than it was 10 years ago, when I first started using R for GIS. Even though I have not read through all these resources carefully, I am pretty sure every topic found in this book can also be found somewhere in them (except the demonstrations). So, you may wonder how on earth you could benefit from reading this book. It all boils down to search costs. Researchers in different disciplines require different sets of spatial data skills. The available resources are typically very general and cover many topics, some of which economists are unlikely to use. It is particularly hard for those without much GIS experience to tell whether a particular skill is essential. As a result, they could spend a lot of time learning something that is not really useful. The value of this book lies in its deliberate incomprehensiveness. It packages only the materials that satisfy the needs of most economists and cuts out many topics that are likely to be of limited use to them.

For those who are looking for a more comprehensive treatment of spatial data handling and processing in one book, I personally like Geocomputation with R a lot. Increasingly, developers of R packages create websites dedicated to their packages, where you can often find vignettes (tutorials), like Simple Features for R.

Topics covered in this book

The book starts with the very basics of spatial data handling (e.g., importing and exporting spatial datasets) and moves on to more practical spatial data operations (e.g., spatial data join) that are useful for research projects. Some parts of this book are still under development. Right now, Chapters 1 through 8, parts of Chapter 9, and Appendix A are available.

  • Chapter 1: Demonstrations of R as GIS
    • groundwater pumping and groundwater level
    • precision agriculture
    • land use and weather
    • corn planted acreage and railroads
    • groundwater pumping and weather
    • slave trade and economic development in Africa
    • terrain ruggedness and economic development in Africa
    • TseTse fly and economic development in Africa
  • Chapter 2: The basics of vector data handling using the sf package
    • spatial data structure in sf
    • import and export vector data
    • (re)projection of spatial datasets
    • single-layer geometrical operations (e.g., create buffers, find centroids)
    • other miscellaneous basic operations
  • Chapter 3: Spatial interactions of vector datasets
    • understand topological relations of multiple sf objects
    • spatially subset a layer based on another layer
    • extracting values from one layer to another layer
  • Chapter 4: The basics of raster data handling using the raster and terra packages
    • understand object classes defined by the terra and raster packages
    • import and export raster data
    • stack raster data
    • quick plotting
    • handle netCDF files
  • Chapter 5: Spatial interactions of vector and raster datasets
    • cropping a raster layer to the geographic extent of a vector layer
    • extracting values from a raster layer to a vector layer
  • Chapter 6: Speed things up
    • make raster data extraction faster by parallelization
  • Chapter 7: Spatiotemporal raster data handling with the stars package
  • Chapter 8: Creating Maps using the ggplot2 package
    • use the ggplot2 package to create maps
  • Chapter 9: Download and process publicly available spatial datasets (partially available)
    • USDA NASS QuickStat (tidyUSDA) - available
    • PRISM (prism) - available
    • Daymet (daymetr) - available
    • gridMET - available
    • Cropland Data Layer (CropScapeR) - available
    • SSURGO (tidycensus) - under construction
    • Census (tidycensus) - under construction
  • Appendix A: Loop and parallel computation
  • Appendix B: Cheatsheet - under construction

As you can see above, this book does not spend any time on the very basics of GIS concepts. Before you start reading the book, you should know at least the following (it’s not much):

  • What Geographic Coordinate System (GCS), Coordinate Reference System (CRS), and projection are (this is a good resource)
  • Distinctions between vector and raster data (this is a simple summary of the difference)
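
For a rough feel of the vector and raster distinction, here is a toy sketch in base R only. It is not how spatial data are actually stored in the sf or terra packages; the coordinates, values, and object names (wells, precip, grid_extent) are all made up for illustration.

```r
# Vector data: discrete features, each with its own coordinates and attributes.
# Here, three hypothetical wells stored as points (longitude, latitude).
wells <- data.frame(
  well_id = c("w1", "w2", "w3"),
  lon = c(-101.5, -101.2, -100.9),
  lat = c(40.1, 40.4, 40.2),
  pumping = c(120, 95, 143) # made-up attribute values
)

# Raster data: a regular grid of cells, each holding one value.
# Cell locations are implied by the grid's extent and dimensions,
# not stored cell by cell.
precip <- matrix(
  c(10, 12, 11,
    9, 13, 14),
  nrow = 2, byrow = TRUE
)
grid_extent <- c(xmin = -102, xmax = -100.5, ymin = 40, ymax = 41)

# Cell size follows from the extent and the number of columns and rows.
cell_width <- unname((grid_extent["xmax"] - grid_extent["xmin"]) / ncol(precip))
cell_height <- unname((grid_extent["ymax"] - grid_extent["ymin"]) / nrow(precip))

c(cell_width = cell_width, cell_height = cell_height)
```

In the vector case every feature carries its own location, while in the raster case a location is recovered from a cell's row and column position within the grid.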

This book is about spatial data processing and does not provide detailed explanations of non-spatial R operations, assuming some basic knowledge of R. In particular, the dplyr and data.table packages are used extensively for data wrangling. For data wrangling using the tidyverse (a collection of packages including dplyr), see R for Data Science. For data.table, this is a good resource.

Finally, this book does not cover spatial statistics or spatial econometrics at all. This book is about spatial data processing. Spatial analysis is something you do after you have processed spatial data.

Conventions of the book and some notes

Here are some notes on the conventions of this book, along with notes for R beginners and those who are not used to reading R Markdown-generated HTML documents.

Text in gray boxes

Text in a gray box is one of the following:

  • objects defined in R during demonstrations
  • R functions
  • R packages

When it is a function, I always put parentheses at the end like this: st_read(). Sometimes, I combine a package and function in one like this: sf::st_read(). This means it is a function called st_read() from the sf package.
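
As a minimal sketch of the package::function() notation, the snippet below uses stats::runif() from the stats package, chosen here only because it ships with R and needs no installation. The same notation applies to sf::st_read(). The seed value is an arbitrary assumption so that the two calls can be compared.

```r
# The stats package is attached by default, so runif() can be called directly.
set.seed(123) # arbitrary seed, only to make the two calls comparable
x_direct <- runif(3)

# The package::function() form names the package explicitly.
set.seed(123)
x_qualified <- stats::runif(3)

# Both calls run the same function, so the results are identical.
identical(x_direct, x_qualified)
```

The explicit form is handy when two attached packages define functions with the same name, or when you want readers to see where a function comes from.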

Colored Boxes

Code is in blue boxes, and output is in red boxes.

Code:

runif(5)

Output:

[1] 0.4915999 0.7632047 0.6585179 0.7324614 0.2124914

Parentheses around code

Sometimes you will see code enclosed in parentheses like this:

(
  a <- runif(5)
)
[1] 0.5868415 0.7597267 0.2599178 0.9837541 0.8517167

The parentheses print the contents of the newly created object (here a) without a separate line that evaluates the object. So, basically I am signaling that we will be looking inside the object that was just created.

This one prints nothing.

a <- runif(5)
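
The two forms can be compared side by side. The following is a small sketch; the seed is an arbitrary assumption added so that both forms draw the same numbers, and the object name b is used only for illustration.

```r
set.seed(2024) # arbitrary seed so both draws are identical

# Wrapping the assignment in parentheses assigns and prints in one step.
(
  a <- runif(5)
)

set.seed(2024)

# Without parentheses, the assignment is silent, so a second line
# is needed to look inside the object.
b <- runif(5)
b

# Both approaches create the same object.
identical(a, b)
```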

Session Information

Here is the session information from when the book was compiled:

R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1

Matrix products: default
BLAS:   /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib 
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib;  LAPACK version 3.12.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: America/Chicago
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] htmlwidgets_1.6.4 compiler_4.4.1    fastmap_1.2.0     cli_3.6.3        
 [5] tools_4.4.1       htmltools_0.5.8.1 rstudioapi_0.16.0 codetools_0.2-20 
 [9] rmarkdown_2.27    knitr_1.48        jsonlite_1.8.8    xfun_0.46        
[13] digest_0.6.36     rlang_1.1.4       evaluate_0.24.0  

References

Wickham, Hadley, and Jay Hesselberth. 2020. Pkgdown: Make Static HTML Documentation for a Package. https://CRAN.R-project.org/package=pkgdown.
Xie, Yihui. 2016. Bookdown: Authoring Books and Technical Documents with R Markdown. Boca Raton, Florida: Chapman and Hall/CRC. https://github.com/rstudio/bookdown.
Xie, Yihui, Alison Presmanes Hill, and Amber Thomas. 2017. Blogdown: Creating Websites with R Markdown. Boca Raton, Florida: Chapman and Hall/CRC. https://github.com/rstudio/blogdown.