runif(5)
Preface
How is this book any different from other online books and resources?
We are seeing an explosion of online (and free) resources that teach how to use R for spatial data processing.1 Here is an incomplete list of such resources:
1 This phenomenon is largely thanks to packages like bookdown
(Xie 2016), blogdown
(Xie, Hill, and Thomas 2017), and pkgdown
(Wickham and Hesselberth 2020) that has significantly lowered the cost of professional contents creation than before. Indeed, this book was built taking advantage of the bookdown
package.
- Geocomputation with R
- Spatial Data Science
- Spatial Data Science with R
- Introduction to GIS using R
- Code for An Introduction to Spatial Analysis and Mapping in R
- Introduction to GIS in R
- Intro to GIS and Spatial Analysis
- Introduction to Spatial Data Programming with R
- Reproducible GIS analysis with R
- R for Earth-System Science
- Rspatial
- NEON Data Skills
- Simple Features for R
- Geospatial Health Data: Modeling and Visualization with R-INLA and Shiny
Thanks to all these resources, it has become much easier to self-teach R for GIS work than 10 years ago when I first started using R for GIS. Even though I have not read through all these resources carefully, I am pretty sure every topic found in this book can also be found somewhere in these resources (except the demonstrations). So, you may wonder why on earth you can benefit from reading this book. It all boils down to search costs. Researchers in different disciplines require different sets of spatial data skills. The available resources are typically very general covering so many topics, some of which economists are unlikely to use. It is particularly hard for those who do not have much experience in GIS to identify whether particular skills are essential or not. So, they could spend so much time learning something that is not really useful. The value of this book lies in its deliberate incomprehensiveness. It only packages materials that satisfy the need of most economists, cutting out many topics that are likely to be of limited use for economists.
For those who are looking for more comprehensive treatments of spatial data handling and processing in one book, I personally like Geocomputation with R a lot. Increasingly, the developer of R packages created a website dedicated to their R packages, where you can often find vignettes (tutorials), like Simple Features for R.
Topics covered in this book
The book starts with the very basics of spatial data handling (e.g., importing and exporting spatial datasets) and moves on to more practical spatial data operations (e.g., spatial data join) that are useful for research projects. Some parts of this books are still under development. Right now, Chapters 1 through 8, parts of Chapter 9, and Appendix A are available.
- Chapter 1: Demonstrations of R as GIS
- groundwater pumping and groundwater level
- precision agriculture
- land use and weather
- corn planted acreage and railroads
- groundwater pumping and weather
- slave trade and economic development in Africa
- terrain ruggedness and economic development in Africa
- TseTse fly and economic developtment in Africa
- Chapter 2: The basics of vector data handling using
sf
package- spatial data structure in
sf
- import and export vector data
- (re)projection of spatial datasets
- single-layer geometrical operations (e.g., create buffers, find centroids)
- other miscellaneous basic operations
- spatial data structure in
- Chapter 3: Spatial interactions of vector datasets
- understand topological relations of multiple
sf
objects - spatially subset a layer based on another layer
- extracting values from one layer to another layer
- understand topological relations of multiple
- Chapter 4: The basics of raster data handling using the
raster
andterra
packages- understand object classes by the
terra
andraster
packages - import and export raster data
- stack raster data
- quick plotting
- handle netCDF files
- understand object classes by the
- Chapter 5: Spatial interactions of vector and raster datasets
- cropping a raster layer to the geographic extent of a vector layer
- extracting values from a raster layer to a vector layer
- Chapter 6: Speed things up
- make raster data extraction faster by parallelization
- Chapter 7: Spatiotemporal raster data handling with the
stars
package - Chapter 8: Creating Maps using the
ggplot2
package- use the
ggplot2
packages to create maps
- use the
- Chapter 9: Download and process publicly available spatial datasets (partially available)
- USDA NASS QuickStat (
tidyUSDA
) - available - PRISM (
prism
) - available - Daymet (
daymetr
) - available - gridMET - available
- Cropland Data Layer (
CropScapeR
) - available - SSURGO (
tidycensus
) - under construction - Census (
tidycensus
) - under construction
- USDA NASS QuickStat (
- Appendix A: Loop and parallel computation
- Appendix B: Cheatsheet - under construction
As you can see above, this book does not spend any time on the very basics of GIS concepts. Before you start reading the book, you should know the followings at least (it’s not much):
- What Geographic Coordinate System (GCS), Coordinate Reference System (CRS), and projection are (this is a good resource)
- Distinctions between vector and raster data (this is a simple summary of the difference)
This book is about spatial data processing and does not provide detailed explanations on non-spatial R operations, assuming some basic knowledge of R. In particular, the dplyr
and data.table
packages are extensively used for data wrangling. For data wrangling using tidyverse
(a collection of packages including dplyr
), see R for Data Science. For data.table
, this is a good resource.
Finally, this book does not cover spatial statistics or spatial econometrics at all. This book is about spatial data processing. Spatial analysis is something you do after you have processed spatial data.
Conventions of the book and some notes
Here are some notes of the conventions of this book and notes for R beginners and those who are not used to reading rmarkdown
-generated html documents.
Texts in gray boxes
They are one of the following:
- objects defined on R during demonstrations
- R functions
- R packages
When it is a function, I always put parentheses at the end like this: st_read()
. Sometimes, I combine a package and function in one like this: sf::st_read()
. This means it is a function called st_read()
from the sf
package.
Colored Boxes
Codes are in blue boxes, and outcomes are in red boxes.
Codes:
Outcomes:
[1] 0.4915999 0.7632047 0.6585179 0.7324614 0.2124914
Parentheses around codes
Sometimes you will see codes enclosed by parenthesis like this:
(
a <- runif(5)
)
[1] 0.5868415 0.7597267 0.2599178 0.9837541 0.8517167
The parentheses prints what’s inside of a newly created object (here a
) without explicitly evaluating the object. So, basically I am signaling that we will be looking inside of the object that was just created.
This one prints nothing.
a <- runif(5)
Session Information
Here is the session information when compiling the book:
R version 4.4.1 (2024-06-14)
Platform: aarch64-apple-darwin20
Running under: macOS Sonoma 14.6.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.4-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.0
locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
time zone: America/Chicago
tzcode source: internal
attached base packages:
[1] stats graphics grDevices utils datasets methods base
loaded via a namespace (and not attached):
[1] htmlwidgets_1.6.4 compiler_4.4.1 fastmap_1.2.0 cli_3.6.3
[5] tools_4.4.1 htmltools_0.5.8.1 rstudioapi_0.16.0 codetools_0.2-20
[9] rmarkdown_2.27 knitr_1.48 jsonlite_1.8.8 xfun_0.46
[13] digest_0.6.36 rlang_1.1.4 evaluate_0.24.0