7 Spatiotemporal Raster Data Handling with stars
Before you start
In this Chapter, we introduce the stars
package (Pebesma 2020) for raster data handling. It can be particularly useful for those who use spatiotemporal raster data often (like daily PRISM and Daymet data) because it brings a framework that provides a consistent treatment of raster data with temporal dimensions. Specifically, stars
objects can have a time dimension in addition to the spatial 2D dimensions (longitude and latitude), where the time dimension can take Date
values.82 This can be handy for several reasons as you will see below (e.g., filtering the data by date).
Another advantage of the stars
package is its compatibility with sf
objects as the lead developer of the two packages is the same person. Therefore, unlike the terra
package approach, we do not need any tedious conversions between sf
and SpatVector
. The stars
package also allows dplyr
-like data operations using functions like filter()
, mutate()
(see section 7.6).
In Chapters 4 and 5, we used the raster
and terra
packages to handle raster data and interact raster data with vector data. If you do not feel any inconvenience with the approach, you do not need to read on. Also, note that the stars
package was not written to replace either raster
or terra
packages. Here is a good summary of how raster
functions map to stars
functions. As you can see, there are many functions that are available to the raster
packages that cannot be implemented by the stars
package. However, I must say the functionality of the stars
package is rather complete at least for most economists, and it is definitely possible to use just the stars
package for all the raster data work in most cases.83
Finally, this book does not cover the use of stars_proxy
for big data that does not fit in your memory, which may be useful for some of you. This provides an introduction to stars_proxy
for those interested. This book also does not cover irregular raster cells (e.g., curvelinear grids). Interested readers are referred to here.
Direction for replication
Datasets
All the datasets that you need to import are available here. In this chapter, the path to files is set relative to my own working directory (which is hidden). To run the codes without having to mess with paths to the files, follow these steps:
- set a folder (any folder) as the working directory using
setwd()
- create a folder called “Data” inside the folder designated as the working directory (if you have created a “Data” folder previously, skip this step)
- download the pertinent datasets from here and put them in the “Data” folder
Packages
- Run the following code to install or load (if already installed) the
pacman
package, and then install or load (if already installed) the listed package inside thepacman::p_load()
function.
if (!require("pacman")) install.packages("pacman")
pacman::p_load(
stars, # spatiotemporal data handling
sf, # vector data handling
tidyverse, # data wrangling
cubelyr, # handle raster data
tmap, # make maps
mapview, # make maps
exactextractr, # fast raster data extraction
lubridate, # handle dates
prism # download PRISM data
)
- Run the following code to define the theme for map:
theme_set(theme_bw())
theme_for_map <- theme(
axis.ticks = element_blank(),
axis.text = element_blank(),
axis.line = element_blank(),
panel.border = element_blank(),
panel.grid.major = element_line(color = "transparent"),
panel.grid.minor = element_line(color = "transparent"),
panel.background = element_blank(),
plot.background = element_rect(fill = "transparent", color = "transparent")
)
7.1 Understanding the structure of a stars
object
Let’s import a stars
object of daily PRISM precipitation and tmax saved as an R dataset.
#--- read PRISM prcp and tmax data ---#
(
prcp_tmax_PRISM_m8_y09 <- readRDS("Data/prcp_tmax_PRISM_m8_y09_small.rds")
)
stars object with 3 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0.000 0.00000 0.000 1.292334 0.01100 30.851
tmax 1.833 17.55575 21.483 22.035435 26.54275 39.707
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
This stars
object has two attributes: ppt
(precipitation) and tmax
(maximum temperature). They are like variables in a regular data.frame
. They hold information we are interested in using for our analysis.
This stars
object has three dimensions: x
(longitude), y
(latitude), and date
. Each dimension has from
, to
, offset
, delta
, refsys
, point
, and values
. Here are their definitions (except point
84):
-
from
: beginning index -
to
: ending index -
offset
: starting value -
delta
: step value -
refsys
: GCS or CRS forx
andy
(and can beDate
for a date dimension) -
values
: values of the dimension when objects are not regular
In order to understand the dimensions of stars
objects, let’s first take a look at the figure below (Figure 7.1), which visualizes the tmax values on the 2D x
-y
surfaces of the stars
objects at 10th date
value (the spatial distribution of tmax at a particular date).
You can consider the 2D x-y surface as a matrix, where the location of a cell is defined by the row number and column number. Since the from
for x
is 1 and to
for x
is 20, we have 20 columns. Similarly, since the from
for y
is 1 and to
for y
is 20, we have 20 rows. The offset
value of the x
and y
dimensions is the longitude and latitude of the upper-left corner point of the upper left cell of the 2D x-y surface, respectively (the red circle in Figure 7.1).85 As refsys
indicates, they are in NAD83 GCS. The longitude of the upper-left corner point of all the cells in the \(j\)th column (from the left) of the 2D x-y surface is -121.7291667 + \((j-1)\times\) 0.0416667, where -121.7291667 is offset
for x
and 0.0416667 is delta
for x
. Similarly, the latitude of the upper-left corner point of all the cells in the \(i\)th row (from the top) of the 2D x-y surface is 46.6458333 +\((i-1)\times\) -0.0416667, where 46.6458333 is offset
for y
and -0.0416667 is delta
for y
.
The dimension characteristics of x
and y
are shared by all the layers across the date dimension, and a particular combination of x
and y
indexes refers to exactly the same location on the earth in all the layers across dates (of course). In the date
dimension, we have 10 date values since the from
for date
is 1 and to
for date
is 10. The refsys
of the date
dimension is Date
. Since the offset
is 2009-08-11 and delta
is 1, \(k\)th layer represents tmax values for August \(11+k-1\), 2009.
Putting all this information together, we have 20 by 20 x-y surfaces stacked over the date
dimension (10 layers), thus making up a 20 by 20 by 10 three-dimensional array (or cube) as shown in the figure below (Figure 7.2).
Remember that prcp_tmax_PRISM_m8_y09
also has another attribute (ppt
) which is structured exactly the same way as tmax
. So, prcp_tmax_PRISM_m8_y09
basically has four dimensions: attribute, x
, y
, and date
.
It is not guaranteed that all the dimensions are regularly spaced or timed. For an irregular dimension, dimension values themselves are stored in values
instead of using indexes, offset
, and delta
to find dimension values. For example, if you observe satellite data with 6-day gaps sometimes and 7-day gaps other times, then the date dimension would be irregular. We will see a made-up example of irregular time dimension in Chapter 7.5.
7.2 Some basic operations on stars
objects
7.2.1 Create a new attribute and drop a attribute
You can add a new attribute to a stars
object just like you add a variable to a data.frame
.
#--- define a new variable which is tmax - 20 ---#
prcp_tmax_PRISM_m8_y09$tmax_less_20 <- prcp_tmax_PRISM_m8_y09$tmax - 20
As you can see below, prcp_tmax_PRISM_m8_y09
now has tmax_less_20
as an attribute.
prcp_tmax_PRISM_m8_y09
stars object with 3 dimensions and 3 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0.000 0.00000 0.000 1.292334 0.01100 30.851
tmax 1.833 17.55575 21.483 22.035435 26.54275 39.707
tmax_less_20 -18.167 -2.44425 1.483 2.035435 6.54275 19.707
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
You can drop an attribute by assining NULL
to the attribute you want to drop.
prcp_tmax_PRISM_m8_y09$tmax_less_20 <- NULL
prcp_tmax_PRISM_m8_y09
stars object with 3 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0.000 0.00000 0.000 1.292334 0.01100 30.851
tmax 1.833 17.55575 21.483 22.035435 26.54275 39.707
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
7.2.2 Subset a stars
object by index
In order to access the value of an attribute (say ppt
) at a particular location at a particular time from the stars
object (prcp_tmax_PRISM_m8_y09
), you need to tell R that you are interested in the ppt
attribute and specify the corresponding index of x
, y
, and date
. Here, is how you get the ppt
value of (3, 4) cell at date = 10.
prcp_tmax_PRISM_m8_y09["ppt", 3, 4, 10]
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0 0 0 0 0 0
dimension(s):
from to offset delta refsys point x/y
x 3 3 -121.7 0.04167 NAD83 FALSE [x]
y 4 4 46.65 -0.04167 NAD83 FALSE [y]
date 10 10 2009-08-11 1 days Date NA
This is very much similar to accessing a value from a matrix except that we have more dimensions. The first argument selects the attribute of interest, 2nd x
, 3rd y
, and 4th date
.
Of course, you can subset a stars
object to access the value of multiple cells like this:
prcp_tmax_PRISM_m8_y09["ppt", 3:6, 3:4, 5:10]
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0 0 0 0.043125 0 0.546
dimension(s):
from to offset delta refsys point x/y
x 3 6 -121.7 0.04167 NAD83 FALSE [x]
y 3 4 46.65 -0.04167 NAD83 FALSE [y]
date 5 10 2009-08-11 1 days Date NA
7.2.3 Set attribute names
When you read a raster dataset from a GeoTIFF file, the attribute name is the file name by default. So, you will often encounter cases where you want to change the attribute name. You can set the name of attributes using setNames()
.
prcp_tmax_PRISM_m8_y09_dif_names <- setNames(prcp_tmax_PRISM_m8_y09, c("precipitation", "maximum_temp"))
Note that the attribute names of prcp_tmax_PRISM_m8_y09
have not changed after this operation (just like mutate()
or other dplyr
functions do):
prcp_tmax_PRISM_m8_y09
stars object with 3 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0.000 0.00000 0.000 1.292334 0.01100 30.851
tmax 1.833 17.55575 21.483 22.035435 26.54275 39.707
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
If you want to reflect changes in the variable names while keeping the same object name, you need to assign the output of setNames()
to the object as follows:
# the codes in the rest of the chapter use "ppt" and "tmax" as the variables names, not these ones
prcp_tmax_PRISM_m8_y09 <- setNames(prcp_tmax_PRISM_m8_y09, c("precipitation", "maximum_temp"))
We will be using this function in Chapter 7.7.2.
7.2.4 Get the coordinate reference system
You can get the CRS of a stars
object using st_crs()
, which is actually the same function name that we used to extract the CRS of an sf
object.
st_crs(prcp_tmax_PRISM_m8_y09)
Coordinate Reference System:
User input: NAD83
wkt:
GEOGCRS["NAD83",
DATUM["North American Datum 1983",
ELLIPSOID["GRS 1980",6378137,298.257222101004,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
CS[ellipsoidal,2],
AXIS["geodetic latitude (Lat)",north,
ORDER[1],
ANGLEUNIT["degree",0.0174532925199433]],
AXIS["geodetic longitude (Lon)",east,
ORDER[2],
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4269]]
You will use this to change the projection of vector datasets when you interact them:
- crop the
stars
raster dataset to the spatial extent ofsf
objects - extract values from the
stars
raster dataset forsf
objects
as we will see in Chapter 5. Notice also that we used exactly the same function name (st_crs()
) to get the CRS of sf
objects (see Chapter 2).
7.2.5 Get the dimension characteristics and values
You can access these dimension values using st_dimensions()
.
#--- get dimension characteristics ---#
(
dim_prcp_tmin <- st_dimensions(prcp_tmax_PRISM_m8_y09)
)
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
The following shows what we can get from this object.
str(dim_prcp_tmin)
List of 3
$ x :List of 7
..$ from : num 1
..$ to : num 20
..$ offset: num -122
..$ delta : num 0.0417
..$ refsys:List of 2
.. ..$ input: chr "NAD83"
.. ..$ wkt : chr "GEOGCRS[\"NAD83\",\n DATUM[\"North American Datum 1983\",\n ELLIPSOID[\"GRS 1980\",6378137,298.257222"| __truncated__
.. ..- attr(*, "class")= chr "crs"
..$ point : logi FALSE
..$ values: NULL
..- attr(*, "class")= chr "dimension"
$ y :List of 7
..$ from : num 1
..$ to : num 20
..$ offset: num 46.6
..$ delta : num -0.0417
..$ refsys:List of 2
.. ..$ input: chr "NAD83"
.. ..$ wkt : chr "GEOGCRS[\"NAD83\",\n DATUM[\"North American Datum 1983\",\n ELLIPSOID[\"GRS 1980\",6378137,298.257222"| __truncated__
.. ..- attr(*, "class")= chr "crs"
..$ point : logi FALSE
..$ values: NULL
..- attr(*, "class")= chr "dimension"
$ date:List of 7
..$ from : num 1
..$ to : int 10
..$ offset: Date[1:1], format: "2009-08-11"
..$ delta : 'difftime' num 1
.. ..- attr(*, "units")= chr "days"
..$ refsys: chr "Date"
..$ point : logi NA
..$ values: NULL
..- attr(*, "class")= chr "dimension"
- attr(*, "raster")=List of 4
..$ affine : num [1:2] 0 0
..$ dimensions : chr [1:2] "x" "y"
..$ curvilinear: logi FALSE
..$ blocksizes : int [1:10, 1:2] 20 20 20 20 20 20 20 20 20 20 ...
.. ..- attr(*, "dimnames")=List of 2
.. .. ..$ : NULL
.. .. ..$ : chr [1:2] "x" "y"
..- attr(*, "class")= chr "stars_raster"
- attr(*, "class")= chr "dimensions"
For example, you can get offset
of x
as follows:
dim_prcp_tmin$x$offset
[1] -121.7292
You can extract dimension values using st_get_dimension_values()
. For example, to get values of date
,
#--- get date values ---#
st_get_dimension_values(prcp_tmax_PRISM_m8_y09, "date")
[1] "2009-08-11" "2009-08-12" "2009-08-13" "2009-08-14" "2009-08-15"
[6] "2009-08-16" "2009-08-17" "2009-08-18" "2009-08-19" "2009-08-20"
This can be handy as you will see in Chapter 9.4.2. Later in Chapter 7.5, we will learn how to set dimensions using st_set_dimensions()
.
7.2.6 Attributes to dimensions, and vice versa
You can make attributes to dimensions using merge()
.
(
prcp_tmax_four <- merge(prcp_tmax_PRISM_m8_y09)
)
stars object with 4 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt.tmax 0 0 11.799 11.66388 21.535 39.707
dimension(s):
from to offset delta refsys point values x/y
x 1 20 -121.7 0.04167 NAD83 FALSE NULL [x]
y 1 20 46.65 -0.04167 NAD83 FALSE NULL [y]
date 1 10 2009-08-11 1 days Date NA NULL
attributes 1 2 NA NA NA NA ppt , tmax
As you can see, the new stars
object has an additional dimension called attributes
, which represents attributes and has two dimension values: the first for ppt
and second for tmax
. Now, if you want to access ppt
, you can do the following:
prcp_tmax_four[, , , , "ppt"]
stars object with 4 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt.tmax 0 0 0 1.292334 0.011 30.851
dimension(s):
from to offset delta refsys point values x/y
x 1 20 -121.7 0.04167 NAD83 FALSE NULL [x]
y 1 20 46.65 -0.04167 NAD83 FALSE NULL [y]
date 1 10 2009-08-11 1 days Date NA NULL
attributes 1 1 NA NA NA NA ppt
We can do this because the merge kept the attribute names as dimension values as you can see in values
.
You can revert it back to the original state using split()
. Since we want the fourth dimension to dissolve,
split(prcp_tmax_four, 4)
stars object with 3 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0.000 0.00000 0.000 1.292334 0.01100 30.851
tmax 1.833 17.55575 21.483 22.035435 26.54275 39.707
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
7.3 Quick visualization for exploration
You can use plot()
to have a quick static map and mapview()
or the tmap
package for interactive views.
7.3.1 quick static map
plot(prcp_tmax_PRISM_m8_y09["tmax", , , ])
It only plots the first attribute if the stars
object has multiple attributes:
plot(prcp_tmax_PRISM_m8_y09)
, which is identical with this:
plot(prcp_tmax_PRISM_m8_y09["ppt", , , ])
7.3.2 interactive map
You can use the tmap
package. You can apply tmap_leaflet()
to a static tmap
object to make it an interactive map. The tm_facets(as.layers = TRUE)
option stacks all the layers in a single map.
#--- make it interactive ---#
tmap_leaflet(
tm_shape(prcp_tmax_PRISM_m8_y09["tmax", , , ]) +
tm_raster() +
tm_facets(as.layers = TRUE)
)
7.4 Reading and writing raster data
There are so many formats in which raster data is stored. Some of the common ones include GeoTIFF, netCDF, GRIB. All the available GDAL drivers for reading and writing raster data can be found by the following code:86
sf::st_drivers(what = "raster")
The output of the above function is put into a table below.
7.4.1 Reading raster data
You can use read_stars()
to read a raster data file. It is very unlikely that the raster file you are trying to read is not in one of the supported formats.
For example, you can read a GeoTIFF file as follows:
(
ppt_m1_y09_stars <- read_stars("Data/PRISM_ppt_y2009_m1.tif")
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point
x 1 1405 -125 0.04167 NAD83 FALSE
y 1 621 49.94 -0.04167 NAD83 FALSE
band 1 31 NA NA NA NA
values
x NULL
y NULL
band PRISM_ppt_stable_4kmD2_20090101_bil,...,PRISM_ppt_stable_4kmD2_20090131_bil
x/y
x [x]
y [y]
band
This one imports a raw PRISM data stored as a BIL file.
(
prism_tmax_20180701 <- read_stars("Data/PRISM_tmax_stable_4kmD2_20180701_bil/PRISM_tmax_stable_4kmD2_20180701_bil.bil")
)
stars object with 2 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
PRISM_tmax_stable_4kmD2_201... 3.623 25.318 31.596 29.50749 33.735 48.008
NA's
PRISM_tmax_stable_4kmD2_201... 390874
dimension(s):
from to offset delta refsys x/y
x 1 1405 -125 0.04167 NAD83 [x]
y 1 621 49.94 -0.04167 NAD83 [y]
You can import multiple raster data files into one stars
object by simply supplying a vector of file names:
files <-
c(
"Data/PRISM_tmax_stable_4kmD2_20180701_bil/PRISM_tmax_stable_4kmD2_20180701_bil.bil",
"Data/PRISM_tmax_stable_4kmD2_20180702_bil/PRISM_tmax_stable_4kmD2_20180702_bil.bil"
)
(
prism_tmax_201807_two <- read_stars(files)
)
stars object with 2 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
1_bil/PRISM_tmax_stable_4km... 3.623 25.318 31.596 29.50749 33.735 48.008
2_bil/PRISM_tmax_stable_4km... 0.808 26.505 30.632 29.93235 33.632 47.999
NA's
1_bil/PRISM_tmax_stable_4km... 390874
2_bil/PRISM_tmax_stable_4km... 390874
dimension(s):
from to offset delta refsys x/y
x 1 1405 -125 0.04167 NAD83 [x]
y 1 621 49.94 -0.04167 NAD83 [y]
As you can see, each file becomes an attribute.87 It is more convenient to have them as layers stacked along the third dimension. To do that you can add along = 3
option as follows:
(
prism_tmax_201807_two <- read_stars(files, along = 3)
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max.
PRISM_tmax_stable_4kmD2_201... 4.693 19.5925 23.653 22.05224 24.5945 33.396
NA's
PRISM_tmax_stable_4kmD2_201... 60401
dimension(s):
from to offset delta refsys
x 1 1405 -125 0.04167 NAD83
y 1 621 49.94 -0.04167 NAD83
new_dim 1 2 NA NA NA
values
x NULL
y NULL
new_dim 1_bil/PRISM_tmax_stable_4kmD2_20180701, 2_bil/PRISM_tmax_stable_4kmD2_20180702
x/y
x [x]
y [y]
new_dim
Note that while the GeoTIFF format (and many other formats) can store multi-band (multi-layer) raster data allowing for an additional dimension beyond x
and y
, it does not store the values of the dimension like the date
dimension we saw in prcp_tmax_PRISM_m8_y09
. So, when you read a multi-layer raster data saved as a GeoTIFF, the third dimension of the resulting stars
object will always be called band
without any explicit time information. On the other hand, netCDF files are capable of storing the time dimension values. So, when you read a netCDF file with valid time dimension, you will have time dimension when it is read.
(
read_ncdf(system.file("nc/bcsd_obs_1999.nc", package = "stars"))
)
stars object with 3 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
pr [mm/m] 0.5900000 56.139999 81.88000 101.26433 121.07250 848.54999 7116
tas [C] -0.4209678 8.898887 15.65763 15.48932 21.77979 29.38581 7116
dimension(s):
from to offset delta refsys values x/y
longitude 1 81 -85 0.125 WGS 84 NULL [x]
latitude 1 33 33 0.125 WGS 84 NULL [y]
time 1 12 NA NA POSIXct 1999-01-31,...,1999-12-31
The refsys
of the time dimension is POSIXct
, which is one of the date classes.
7.4.2 Writing a stars
object to a file
You can write a stars
object to a file using write_stars()
using one of the GDAL drivers.
Let’s save prcp_tmax_PRISM_m8_y09["tmax",,,]
, which has date
dimension whose refsys
is Date
.
write_stars(prcp_tmax_PRISM_m8_y09["tmax", , , ], "Data/tmax_m8_y09_from_stars.tif")
Let’s read the file we just saved.
read_stars("Data/tmax_m8_y09_from_stars.tif")
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
tmax_m8_y09_from_stars.tif 1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
band 1 10 NA NA NA NA
Notice that the third dimension is now called band and all the date information is lost. The loss of information happened when we saved prcp_tmax_PRISM_m8_y09["tmax",,,]
as a GeoTIFF file. One easy way to avoid this problem is to just save a stars
object as an R dataset.
#--- save it as an rds file ---#
saveRDS(prcp_tmax_PRISM_m8_y09["tmax", , , ], "Data/tmax_m8_y09_from_stars.rds")
#--- read it back ---#
readRDS("Data/tmax_m8_y09_from_stars.rds")
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
tmax 1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
As you can see, date information is retained. So, if you are the only one who uses this data or all of your team members use R, then this is a nice solution to the problem.88 At the moment, it is not possible to use write_stars()
to write to a netCDF file that supports the third dimension as time. However, this may not be the case for a long time (See the discussion here).
7.5 Setting the time dimension manually
For this section, we will use PRISM precipitation data for U.S. for January, 2009.
(
ppt_m1_y09_stars <- read_stars("Data/PRISM_ppt_y2009_m1.tif")
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point
x 1 1405 -125 0.04167 NAD83 FALSE
y 1 621 49.94 -0.04167 NAD83 FALSE
band 1 31 NA NA NA NA
values
x NULL
y NULL
band PRISM_ppt_stable_4kmD2_20090101_bil,...,PRISM_ppt_stable_4kmD2_20090131_bil
x/y
x [x]
y [y]
band
As you can see, when you read a GeoTIFF file, the third dimension will always be called band
because GeoTIFF format does not support dimension values for the third dimension.89
You can use st_set_dimension()
to set the third dimension (called band
) as the time dimension using Date
object. This can be convenient when you would like to filter the data by date using filter()
as we will see later.
For ppt_m1_y09_stars
, precipitation is observed on a daily basis from January 1, 2009 to January 31, 2009, where the band value of x corresponds to January x, 2009. So, we can first create a vector of dates as follows (If you are not familiar with Dates
and the lubridate
pacakge, this is a good resource to learn them.):
#--- starting date ---#
start_date <- ymd("2009-01-01")
#--- ending date ---#
end_date <- ymd("2009-01-31")
#--- sequence of dates ---#
dates_ls_m1 <- seq(start_date, end_date, "days")
We can then use st_set_dimensions()
to change the third dimension to the dimension of date.
{ eval = F} st_set_dimensions( stars object, dimension, values = dimension values, names = name of the dimension )
(
ppt_m1_y09_stars <-
st_set_dimensions(
ppt_m1_y09_stars,
3,
values = dates_ls_m1,
names = "date"
)
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point x/y
x 1 1405 -125 0.04167 NAD83 FALSE [x]
y 1 621 49.94 -0.04167 NAD83 FALSE [y]
date 1 31 2009-01-01 1 days Date NA
As you can see, the third dimension has become date. The value of offset for the date dimension has become 2009-01-01
meaning the starting date value is now 2009-01-01
. Further, the value of delta is now 1 days, so date dimension of x corresponds to 2009-01-01
+ x - 1 = 2009-01-x
.
Note that the date dimension does not have to be regularly spaced. For example, you may have satellite images available for your area with a 5-day interval sometimes and a 6-day interval other times. This is perfectly fine. As an illustration, I will create a wrong sequence of dates for this data with a 2-day gap in the middle and assign them to the date dimension to see what happens.
#--- 2009-01-23 removed and 2009-02-01 added ---#
(
dates_ls_wrong <- c(seq(start_date, end_date, "days")[-23], ymd("2009-02-01"))
)
[1] "2009-01-01" "2009-01-02" "2009-01-03" "2009-01-04" "2009-01-05"
[6] "2009-01-06" "2009-01-07" "2009-01-08" "2009-01-09" "2009-01-10"
[11] "2009-01-11" "2009-01-12" "2009-01-13" "2009-01-14" "2009-01-15"
[16] "2009-01-16" "2009-01-17" "2009-01-18" "2009-01-19" "2009-01-20"
[21] "2009-01-21" "2009-01-22" "2009-01-24" "2009-01-25" "2009-01-26"
[26] "2009-01-27" "2009-01-28" "2009-01-29" "2009-01-30" "2009-01-31"
[31] "2009-02-01"
Now assign these date values to ppt_m1_y09_stars
:
#--- set date values ---#
(
ppt_m1_y09_stars_wrong <-
st_set_dimensions(
ppt_m1_y09_stars,
3,
values = dates_ls_wrong,
names = "date"
)
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point values x/y
x 1 1405 -125 0.04167 NAD83 FALSE NULL [x]
y 1 621 49.94 -0.04167 NAD83 FALSE NULL [y]
date 1 31 NA NA Date NA 2009-01-01,...,2009-02-01
Since the step between the date values is no longer \(1\) day for the entire sequence, the value of delta
is now NA
. However, notice that the value of date is no longer NULL. Since the date is not regular, you cannot represent date using three values (from
, to
, and delta
) any more, and date values for each observation have to be stored now.
Finally, note that just applying st_set_dimensions()
to a stars
object does not change the dimension of the stars
object (just like setNames()
as we discussed above).
ppt_m1_y09_stars
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point x/y
x 1 1405 -125 0.04167 NAD83 FALSE [x]
y 1 621 49.94 -0.04167 NAD83 FALSE [y]
date 1 31 2009-01-01 1 days Date NA
As you can see, the date dimension has not been altered. You need to assign the results of st_set_dimensions()
to a stars
object to see the changes in the dimension reflected just like we did above with the right date values.
7.6 dplyr
-like operations
You can use the dplyr
language to do basic data operations on stars
objects.
7.6.1 filter()
The filter()
function allows you to subset data by dimension values: x, y, and band (here date).
spatial filtering
library(tidyverse)
#--- longitude greater than -100 ---#
filter(ppt_m1_y09_stars, x > -100) %>% plot()
temporal filtering
Finally, since the date dimension is in Date
, you can use Date
math to filter the data.90
filter by attribute?
Just in case you are wondering. You cannot filter by attribute.
filter(ppt_m1_y09_stars, ppt > 20)
Error in `map2_int()`:
ℹ In index: 1.
Caused by error in `glubort()`:
! ``~``, `ppt > 20` must refer to exactly one dimension, not ``
7.6.2 select()
The select()
function lets you pick certain attributes.
select(prcp_tmax_PRISM_m8_y09, ppt)
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0 0 0 1.292334 0.011 30.851
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
7.6.3 mutate()
You can mutate attributes using the mutate()
function. For example, this can be useful to calculate NDVI in a stars
object that has Red and NIR (spectral reflectance measurements in the red and near-infrared regions) as attributes. Here, we just simply convert the unit of precipitation from mm to inches.
#--- mm to inches ---#
mutate(prcp_tmax_PRISM_m8_y09, ppt = ppt * 0.0393701)
stars object with 3 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0.000 0.00000 0.000 0.0508793 4.330711e-04 1.214607
tmax 1.833 17.55575 21.483 22.0354348 2.654275e+01 39.707001
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
7.6.4 pull()
You can extract attribute values using pull()
.
#--- tmax values of the 1st date layer ---#
pull(prcp_tmax_PRISM_m8_y09["tmax", , , 1], "tmax")
, , 1
[,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,] 25.495 27.150 22.281 19.553 21.502 19.616 21.664 24.458 21.712 19.326
[2,] 27.261 27.042 21.876 19.558 19.141 19.992 18.694 24.310 21.094 20.032
[3,] 27.568 21.452 23.220 19.428 18.974 19.366 20.293 24.418 20.190 21.127
[4,] 23.310 20.357 20.111 21.380 19.824 19.638 22.135 20.885 20.121 18.396
[5,] 19.571 20.025 18.407 19.142 19.314 22.759 22.652 19.566 18.817 16.065
[6,] 17.963 16.581 17.121 16.716 19.680 20.521 17.991 18.007 17.430 14.586
[7,] 20.911 19.202 16.309 14.496 17.429 19.083 18.509 18.723 17.645 16.415
[8,] 17.665 16.730 17.691 14.370 16.849 18.413 17.787 20.000 19.319 18.401
[9,] 16.795 18.091 20.460 16.405 18.331 19.005 19.142 21.226 21.184 19.520
[10,] 19.208 19.624 17.210 19.499 18.492 20.854 18.684 19.811 22.058 19.923
[11,] 23.148 18.339 19.676 20.674 18.545 21.126 19.013 19.722 21.843 21.271
[12,] 23.254 21.279 21.921 19.894 19.445 21.499 19.765 20.742 21.560 22.989
[13,] 23.450 21.956 19.813 18.970 20.173 20.567 21.152 20.932 19.836 20.347
[14,] 24.075 21.120 20.166 19.177 20.428 20.908 21.060 19.832 19.764 19.981
[15,] 24.318 20.943 20.024 20.022 19.040 19.773 20.452 20.152 20.321 20.304
[16,] 22.538 19.461 20.100 21.149 19.958 20.486 20.535 20.445 21.564 21.493
[17,] 20.827 20.192 21.165 22.369 21.488 22.031 21.552 21.089 21.687 23.375
[18,] 21.089 21.451 22.692 21.793 22.160 23.049 22.562 22.738 23.634 24.697
[19,] 22.285 22.992 23.738 23.497 24.255 25.177 25.411 24.324 24.588 26.032
[20,] 23.478 23.584 24.589 24.719 26.114 26.777 27.310 26.643 26.516 26.615
[,11] [,12] [,13] [,14] [,15] [,16] [,17] [,18] [,19] [,20]
[1,] 24.845 23.211 21.621 23.180 21.618 21.322 22.956 23.749 23.162 25.746
[2,] 24.212 21.820 21.305 22.960 22.806 22.235 23.604 23.689 25.597 25.933
[3,] 20.879 20.951 22.280 23.536 23.455 22.537 24.760 23.587 26.170 24.764
[4,] 17.435 18.681 22.224 24.122 25.828 25.604 25.298 22.817 24.438 24.835
[5,] 14.186 17.789 20.624 23.416 26.059 27.571 25.158 24.201 26.001 26.235
[6,] 10.188 15.632 19.907 22.660 25.268 27.469 27.376 27.488 27.278 27.558
[7,] 14.797 15.933 19.204 21.641 23.107 25.626 26.990 25.838 26.906 27.247
[8,] 17.325 18.299 19.691 21.553 21.840 23.754 26.099 25.270 26.282 26.981
[9,] 19.322 19.855 20.489 22.597 23.614 25.873 26.906 26.368 26.332 25.844
[10,] 20.241 21.800 22.111 24.128 25.765 27.105 27.200 25.491 26.306 25.663
[11,] 23.398 24.090 24.884 25.596 26.545 27.014 26.464 25.708 25.742 25.336
[12,] 22.576 21.996 23.874 26.447 26.955 26.871 25.533 25.576 25.610 25.902
[13,] 22.023 22.358 24.996 26.185 27.249 25.617 25.623 25.600 25.433 26.681
[14,] 20.974 23.533 25.388 25.975 27.316 26.199 26.090 25.920 25.767 27.956
[15,] 20.982 23.632 24.703 25.539 26.515 27.133 27.407 27.518 27.149 28.506
[16,] 22.500 24.012 25.282 25.751 25.212 25.290 26.058 28.258 28.290 29.842
[17,] 23.024 24.381 25.157 25.259 24.829 24.183 25.632 26.947 28.601 29.589
[18,] 24.016 24.425 24.965 24.930 24.482 23.274 25.412 26.733 28.494 29.656
[19,] 24.065 24.105 24.145 24.318 23.912 22.782 25.039 26.554 28.184 29.012
[20,] 23.979 24.321 23.477 22.135 22.395 22.189 24.944 26.542 27.923 28.849
7.7 Merging stars
objects using c()
and st_mosaic()
7.7.1 Merging stars
objects along the third dimension (band)
Here we learn how to merge multiple stars
objects that have
- the same attributes
- the same spatial extent and resolution
- different bands (dates here)
For example, consider merging PRISM precipitation data in January and February. Both of them have exactly the same spatial extent and resolutions and represent the same attribute (precipitation). However, they differ in the third dimension (date). So, you are trying to stack data of the same attributes along the third dimension (date) while making sure that spatial correspondence is maintained. This merge is kind of like rbind()
that stacks multiple data.frame
s vertically while making sure the variables are aligned correctly.
Let’s import the PRISM precipitation data for February, 2009.
#--- read the February ppt data ---#
(
ppt_m2_y09_stars <- read_stars("Data/PRISM_ppt_y2009_m2.tif")
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m2.tif 0 0 0 0.1855858 0 11.634 60401
dimension(s):
from to offset delta refsys point
x 1 1405 -125 0.04167 NAD83 FALSE
y 1 621 49.94 -0.04167 NAD83 FALSE
band 1 28 NA NA NA NA
values
x NULL
y NULL
band PRISM_ppt_stable_4kmD2_20090201_bil,...,PRISM_ppt_stable_4kmD2_20090228_bil
x/y
x [x]
y [y]
band
Note here that the third dimension of ppt_m2_y09_stars
has not been changed to date.
Now, let’s try to merge the two:
#--- combine the two ---#
(
ppt_m1_to_m2_y09_stars <- c(ppt_m1_y09_stars, ppt_m2_y09_stars, along = 3)
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point x/y
x 1 1405 -125 0.04167 NAD83 FALSE [x]
y 1 621 49.94 -0.04167 NAD83 FALSE [y]
date 1 59 2009-01-01 1 days Date NA
As you noticed, the second object (ppt_m2_y09_stars
) was assumed to have the same date characteristics as the first one: the data in February is observed daily (delta
is 1 day). This causes no problem in this instance as the February data is indeed observed daily starting from 2009-02-01
. However, be careful if you are appending the data that does not start from 1 day after (or more generally delta
for the time dimension) the first data or the data that does not follow the same observation interval.
For this reason, it is advisable to first set the date values if it has not been set. Pretend that the February data actually starts from 2009-02-02
to 2009-03-01
to see what happens when the regular interval (delta
) is not kept after merging.
#--- starting date ---#
start_date <- ymd("2009-02-01")
#--- ending date ---#
end_date <- ymd("2009-02-28")
#--- sequence of dates ---#
dates_ls <- seq(start_date, end_date, "days")
#--- pretend the data actually starts from `2009-02-02` to `2009-03-01` ---#
(
ppt_m2_y09_stars <- st_set_dimensions(ppt_m2_y09_stars, 3, values = c(dates_ls[-1], ymd("2009-03-01")), name = "date")
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m2.tif 0 0 0 0.1855858 0 11.634 60401
dimension(s):
from to offset delta refsys point x/y
x 1 1405 -125 0.04167 NAD83 FALSE [x]
y 1 621 49.94 -0.04167 NAD83 FALSE [y]
date 1 28 2009-02-02 1 days Date NA
If you merge the two,
c(ppt_m1_y09_stars, ppt_m2_y09_stars, along = 3)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point values x/y
x 1 1405 -125 0.04167 NAD83 FALSE NULL [x]
y 1 621 49.94 -0.04167 NAD83 FALSE NULL [y]
date 1 59 NA NA Date NA 2009-01-01,...,2009-03-01
The date dimension does not have delta
any more, and correctly so because there is a one-day gap between the end date of the first stars
object (“2009-01-31”) and the start date of the second stars
object (“2009-02-02”). So, all the date values are now stored in values
.
7.7.2 Merging stars
objects of different attributes
Here we learn how to merge multiple stars
objects that have
- different attributes
- the same spatial extent and resolution
- the same bands (dates here)
For example, consider merging PRISM precipitation and tmax data in January. Both of them have exactly the same spatial extent and resolutions and the date characteristics (starting and ending on the same dates with the same time interval). However, they differ in what they represent: precipitation and tmax. This merge is kind of like cbind()
that combines multiple data.frame
s of different variables while making sure the observation correspondence is correct.
Let’s read the daily tmax data for January, 2009:
(
tmax_m1_y09_stars <- read_stars("Data/PRISM_tmax_y2009_m1.tif") %>%
#--- change the attribute name ---#
setNames("tmax")
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
tmax -17.898 -7.761 -2.248 -3.062721 1.718 7.994 60401
dimension(s):
from to offset delta refsys point
x 1 1405 -125 0.04167 NAD83 FALSE
y 1 621 49.94 -0.04167 NAD83 FALSE
band 1 31 NA NA NA NA
values
x NULL
y NULL
band PRISM_tmax_stable_4kmD2_20090101_bil,...,PRISM_tmax_stable_4kmD2_20090131_bil
x/y
x [x]
y [y]
band
Now, let’s merge the PRISM ppt and tmax data in January, 2009.
c(ppt_m1_y09_stars, tmax_m1_y09_stars)
Error in c.stars(ppt_m1_y09_stars, tmax_m1_y09_stars): don't know how to merge arrays: please specify parameter along
Oops. Well, the problem is that the third dimension of the two objects is not the same. Even though we know that the xth element of their third dimension represent the same thing, they look different to R’s eyes. So, we first need to change the third dimension of tmax_m1_y09_stars
to be consistent with the third dimension of ppt_m1_y09_stars
(dates_ls_m1
was defined in Chapter 7.5).
(
tmax_m1_y09_stars <- st_set_dimensions(tmax_m1_y09_stars, 3, values = dates_ls_m1, names = "date")
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
tmax -17.898 -7.761 -2.248 -3.062721 1.718 7.994 60401
dimension(s):
from to offset delta refsys point x/y
x 1 1405 -125 0.04167 NAD83 FALSE [x]
y 1 621 49.94 -0.04167 NAD83 FALSE [y]
date 1 31 2009-01-01 1 days Date NA
Now, we can merge the two.
(
ppt_tmax_m1_y09_stars <- c(ppt_m1_y09_stars, tmax_m1_y09_stars)
)
stars object with 3 dimensions and 2 attributes
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0.000 0.000 0.436 3.543222 3.4925 56.208 60401
tmax -17.898 -7.761 -2.248 -3.062721 1.7180 7.994 60401
dimension(s):
from to offset delta refsys point x/y
x 1 1405 -125 0.04167 NAD83 FALSE [x]
y 1 621 49.94 -0.04167 NAD83 FALSE [y]
date 1 31 2009-01-01 1 days Date NA
As you can see, now we have another attribute called tmax
.
7.7.3 Merging stars
objects of different spatial extents
Here we learn how to merge multiple stars
objects that have
- the same attributes
- different spatial extent but same resolution91
- the same bands (dates here)
Some times you have multiple separate raster datasets that have different spatial coverages and would like to combine them into one. You can do that using st_mosaic()
.
Let’s split tmax_m1_y09_stars
into two parts:
Here (Figure 7.3) is what they look like (only Jan 1, 1980):
g_1 <- ggplot() +
geom_stars(data = tmax_1[, , , 1]) +
theme_for_map +
theme(
legend.position = "bottom"
)
g_2 <- ggplot() +
geom_stars(data = tmax_2[, , , 1]) +
theme_for_map +
theme(
legend.position = "bottom"
)
library(patchwork)
g_1 + g_2
Let’s combine the two using st_mosaic()
:
tmax_combined <- st_mosaic(tmax_1, tmax_2)
Here is what the combined object looks like (Figure 7.4)
ggplot() +
geom_stars(data = tmax_combined[, , , 1]) +
theme_for_map
It is okay to have the two stars
objects to be combined have a spatial overlap. The following split creates two stars
objects with a spatial overlap.
Here (Figure 7.5) is what they look like (only Jan 1, 1980):
g_1 <- ggplot() +
geom_stars(data = tmax_m1_y09_stars[, , , 1], fill = NA) +
geom_stars(data = tmax_1[, , , 1]) +
theme_for_map
g_2 <- ggplot() +
geom_stars(data = tmax_m1_y09_stars[, , , 1], fill = NA) +
geom_stars(data = tmax_2[, , , 1]) +
theme_for_map
library(patchwork)
g_1 / g_2
As you can see below, st_mosaic()
reconciles the spatial overlap between the two stars
objects.
(
tmax_combined <- st_mosaic(tmax_1, tmax_2)
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
tmax -17.898 -7.761 -2.248 -3.062721 1.718 7.994 60401
dimension(s):
from to offset delta refsys x/y
x 1 1405 -125 0.04167 NAD83 [x]
y 1 621 49.94 -0.04167 NAD83 [y]
band 1 31 NA NA NA
Here is the plot (Figure 7.6):
ggplot() +
geom_stars(data = tmax_combined[, , , 1]) +
theme_for_map
7.8 Apply a function to one or more dimensions
Sometimes, you want to apply a function to one or more dimensions of a stars
object. For example, you might want to calculate total annual precipitation for each cell from daily precipitation raster data. For illustration, let’s create a single-attribute stars
object that holds daily precipitation in August in 2009.
(
ppt_m8_y09 <- prcp_tmax_PRISM_m8_y09["ppt"]
)
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
ppt 0 0 0 1.292334 0.011 30.851
dimension(s):
from to offset delta refsys point x/y
x 1 20 -121.7 0.04167 NAD83 FALSE [x]
y 1 20 46.65 -0.04167 NAD83 FALSE [y]
date 1 10 2009-08-11 1 days Date NA
Let’s find total precipitation for each cell in August in 2009 using st_apply()
, which works like this.
st_apply(
X = stars object,
MARGIN = margin,
FUN = function to apply
)
Since we want to sum the value of attribute (ppt) for each of the cells (each cell is represented by x-y
), we specify MARGIN = c("x", "y")
and use sum()
as the function.
monthly_ppt <-
st_apply(
ppt_m8_y09,
MARGIN = c("x", "y"),
FUN = sum
)
plot(monthly_ppt)
7.9 Convert from and to Raster
\(^*\) or SpatRaster
objects
When you need to convert a stars
object to a Raster
\(^*\) or SpatRaster
object, you can use the as()
function as follows:
(
# to Raster* object
prcp_tmax_PRISM_m8_y09_rb <- as(prcp_tmax_PRISM_m8_y09, "Raster")
)
class : RasterBrick
dimensions : 20, 20, 400, 10 (nrow, ncol, ncell, nlayers)
resolution : 0.04166667, 0.04166667 (x, y)
extent : -121.7292, -120.8958, 45.8125, 46.64583 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=NAD83 +no_defs
source : memory
names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6, layer.7, layer.8, layer.9, layer.10
min values : 0, 0, 0, 0, 0, 0, 0, 0, 0, 0
max values : 0.019, 18.888, 30.851, 4.287, 0.797, 0.003, 3.698, 1.801, 0.000, 0.000
time : 2009-08-11, 2009-08-12, 2009-08-13, 2009-08-14, 2009-08-15, 2009-08-16, 2009-08-17, 2009-08-18, 2009-08-19, 2009-08-20
(
# to SpatRaster
prcp_tmax_PRISM_m8_y09_sr <- as(prcp_tmax_PRISM_m8_y09, "SpatRaster")
)
class : SpatRaster
dimensions : 20, 20, 10 (nrow, ncol, nlyr)
resolution : 0.04166667, 0.04166667 (x, y)
extent : -121.7292, -120.8958, 45.8125, 46.64583 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat NAD83 (EPSG:4269)
source(s) : memory
names : date2~08-11, date2~08-12, date2~08-13, date2~08-14, date2~08-15, date2~08-16, ...
min values : 0.000, 0.000, 0.000, 0.000, 0.000, 0.000, ...
max values : 0.019, 18.888, 30.851, 4.287, 0.797, 0.003, ...
As you can see, date values in prcp_tmax_PRISM_m8_y09
appear in the time dimension of prcp_tmax_PRISM_m8_y09_rb
(RasterBrick
). However, in prcp_tmax_PRISM_m8_y09_sr
(SpatRaster
), date values appear in the names
dimension with date
added as prefix to the original date values.
Note also that the conversion was done for only the ppt
attribute. This is simply because the raster
and terra
package does not accommodate multiple attributes of 3-dimensional array. So, if you want a RasterBrick
or SpatRaster
of the tmax data, then you need to do the following:
(
# to RasterBrick
tmax_PRISM_m8_y09_rb <- as(prcp_tmax_PRISM_m8_y09["tmax", , , ], "Raster")
)
class : RasterBrick
dimensions : 20, 20, 400, 10 (nrow, ncol, ncell, nlayers)
resolution : 0.04166667, 0.04166667 (x, y)
extent : -121.7292, -120.8958, 45.8125, 46.64583 (xmin, xmax, ymin, ymax)
crs : +proj=longlat +datum=NAD83 +no_defs
source : memory
names : layer.1, layer.2, layer.3, layer.4, layer.5, layer.6, layer.7, layer.8, layer.9, layer.10
min values : 10.188, 8.811, 6.811, 3.640, 1.833, 6.780, 10.318, 14.270, 18.281, 19.818
max values : 29.842, 28.892, 27.123, 23.171, 21.858, 24.690, 27.740, 32.906, 35.568, 39.707
time : 2009-08-11, 2009-08-12, 2009-08-13, 2009-08-14, 2009-08-15, 2009-08-16, 2009-08-17, 2009-08-18, 2009-08-19, 2009-08-20
(
# to SpatRaster
tmax_PRISM_m8_y09_sr <- as(prcp_tmax_PRISM_m8_y09["tmax", , , ], "SpatRaster")
)
class : SpatRaster
dimensions : 20, 20, 10 (nrow, ncol, nlyr)
resolution : 0.04166667, 0.04166667 (x, y)
extent : -121.7292, -120.8958, 45.8125, 46.64583 (xmin, xmax, ymin, ymax)
coord. ref. : lon/lat NAD83 (EPSG:4269)
source(s) : memory
names : date2~08-11, date2~08-12, date2~08-13, date2~08-14, date2~08-15, date2~08-16, ...
min values : 10.188, 8.811, 6.811, 3.640, 1.833, 6.78, ...
max values : 29.842, 28.892, 27.123, 23.171, 21.858, 24.69, ...
You can convert a Raster
\(^*\) object to a stars
object using st_as_stars()
(you will see a use case in Chapter 9.4.2).
(
tmax_PRISM_m8_y09_back_to_stars <- st_as_stars(tmax_PRISM_m8_y09_rb)
)
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
layer.1 1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
from to offset delta refsys x/y
x 1 20 -121.7 0.04167 NAD83 [x]
y 1 20 46.65 -0.04167 NAD83 [y]
band 1 10 2009-08-11 1 days Date
Notice that the original date values (stored in the date
dimension) are recovered, but its dimension is now called just band
.
You can do the same with a SpatRaster
object.
(
tmax_PRISM_m8_y09_back_to_stars <- st_as_stars(tmax_PRISM_m8_y09_sr)
)
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
date2009-08-11 1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
from to offset delta refsys values x/y
x 1 20 -121.7 0.04167 NAD83 NULL [x]
y 1 20 46.65 -0.04167 NAD83 NULL [y]
band 1 10 NA NA NA date2009-08-11,...,date2009-08-20
Note that unlike the conversion from the RasterBrick
object, the band
dimension inherited values of the name
dimension in tmax_PRISM_m8_y09_sr
.