7.4 Reading and writing raster data
There are so many formats in which raster data is stored. Some of the common ones include GeoTIFF, netCDF, GRIB. All the available GDAL drivers for reading and writing raster data can be found by the following code:81
::st_drivers(what = "raster") sf
The output of the above function is put into a table below.
7.4.1 Reading raster data
You can use read_stars()
to read a raster data file. It is very unlikely that the raster file you are trying to read is not in one of the supported formats.
For example, you can read a GeoTIFF file as follows:
(<- read_stars("Data/PRISM_ppt_y2009_m1.tif")
ppt_m1_y09_stars )
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
PRISM_ppt_y2009_m1.tif 0 0 0.436 3.543222 3.4925 56.208 60401
dimension(s):
from to offset delta refsys point
x 1 1405 -125.021 0.0416667 NAD83 FALSE
y 1 621 49.9375 -0.0416667 NAD83 FALSE
band 1 31 NA NA NA NA
values
x NULL
y NULL
band PRISM_ppt_stable_4kmD2_20090101_bil,...,PRISM_ppt_stable_4kmD2_20090131_bil
x/y
x [x]
y [y]
band
This one imports a raw PRISM data stored as a BIL file.
(<- read_stars("Data/PRISM_tmax_stable_4kmD2_20180701_bil/PRISM_tmax_stable_4kmD2_20180701_bil.bil")
prism_tmax_20180701 )
stars object with 2 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
PRISM_tmax_stable_4kmD2_201807... 3.623 25.318 31.596 29.50749 33.735 48.008
NA's
PRISM_tmax_stable_4kmD2_201807... 390874
dimension(s):
from to offset delta refsys point values x/y
x 1 1405 -125.021 0.0416667 NAD83 NA NULL [x]
y 1 621 49.9375 -0.0416667 NAD83 NA NULL [y]
You can import multiple raster data files into one stars
object by simply supplying a vector of file names:
<- c("Data/PRISM_tmax_stable_4kmD2_20180701_bil/PRISM_tmax_stable_4kmD2_20180701_bil.bil", "Data/PRISM_tmax_stable_4kmD2_20180702_bil/PRISM_tmax_stable_4kmD2_20180702_bil.bil")
files
(<- read_stars(files)
prism_tmax_201807_two )
stars object with 2 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
PRISM_tmax_stable_4kmD2_201807... 3.623 25.318 31.596 29.50749 33.735 48.008
PRISM_tmax_stable_4kmD2_201807... 0.808 26.505 30.632 29.93235 33.632 47.999
NA's
PRISM_tmax_stable_4kmD2_201807... 390874
PRISM_tmax_stable_4kmD2_201807... 390874
dimension(s):
from to offset delta refsys point values x/y
x 1 1405 -125.021 0.0416667 NAD83 NA NULL [x]
y 1 621 49.9375 -0.0416667 NAD83 NA NULL [y]
As you can see, each file becomes an attribute.82 It is more convenient to have them as layers stacked along the third dimension. To do that you can add along = 3
option as follows:
(<- read_stars(files, along = 3)
prism_tmax_201807_two )
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
Min. 1st Qu. Median Mean 3rd Qu. Max.
PRISM_tmax_stable_4kmD2_201807... 4.693 19.5925 23.653 22.05224 24.5945 33.396
NA's
PRISM_tmax_stable_4kmD2_201807... 60401
dimension(s):
from to offset delta refsys point values x/y
x 1 1405 -125.021 0.0416667 NAD83 NA NULL [x]
y 1 621 49.9375 -0.0416667 NAD83 NA NULL [y]
new_dim 1 2 NA NA NA NA NULL
Note that while the GeoTIFF format (and many other formats) can store multi-band (multi-layer) raster data allowing for an additional dimension beyond x
and y
, it does not store the values of the dimension like the date
dimension we saw in prcp_tmax_PRISM_m8_y09
. So, when you read a multi-layer raster data saved as a GeoTIFF, the third dimension of the resulting stars
object will always be called band
without any explicit time information. On the other hand, netCDF files are capable of storing the time dimension values. So, when you read a netCDF file with valid time dimension, you will have time dimension when it is read.
(read_ncdf(system.file("nc/bcsd_obs_1999.nc", package = "stars"))
)
stars object with 3 dimensions and 2 attributes
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max. NA's
pr [mm/m] 0.5900000 56.139999 81.88000 101.26433 121.07250 848.54999 7116
tas [C] -0.4209678 8.898887 15.65763 15.48932 21.77979 29.38581 7116
dimension(s):
from to offset delta refsys point values x/y
longitude 1 81 -85 0.125 WGS 84 NA NULL [x]
latitude 1 33 33 0.125 WGS 84 NA NULL [y]
time 1 12 NA NA POSIXct NA 1999-01-31,...,1999-12-31
The refsys
of the time dimension is POSIXct
, which is one of the date classes.
7.4.2 Writing a stars
object to a file
You can write a stars
object to a file using write_stars()
using one of the GDAL drivers.
Let’s save prcp_tmax_PRISM_m8_y09["tmax",,,]
, which has date
dimension whose refsys
is Date
.
write_stars(prcp_tmax_PRISM_m8_y09["tmax",,,], "Data/tmax_m8_y09_from_stars.tif")
Let’s read the file we just saved.
read_stars("Data/tmax_m8_y09_from_stars.tif")
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
tmax_m8_y09_from_stars.tif 1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
from to offset delta refsys point values x/y
x 1 20 -121.729 0.0416667 NAD83 FALSE NULL [x]
y 1 20 46.6458 -0.0416667 NAD83 FALSE NULL [y]
band 1 10 NA NA NA NA NULL
Notice that the third dimension is now called band and all the date information is lost. The loss of information happened when we saved prcp_tmax_PRISM_m8_y09["tmax",,,]
as a GeoTIFF file. One easy way to avoid this problem is to just save a stars
object as an R dataset.
#--- save it as an rds file ---#
saveRDS(prcp_tmax_PRISM_m8_y09["tmax",,,], "Data/tmax_m8_y09_from_stars.rds")
#--- read it back ---#
readRDS("Data/tmax_m8_y09_from_stars.rds")
stars object with 3 dimensions and 1 attribute
attribute(s):
Min. 1st Qu. Median Mean 3rd Qu. Max.
tmax 1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
from to offset delta refsys point values x/y
x 1 20 -121.729 0.0416667 NAD83 FALSE NULL [x]
y 1 20 46.6458 -0.0416667 NAD83 FALSE NULL [y]
date 1 10 2009-08-11 1 days Date NA NULL
As you can see, date information is retained. So, if you are the only one who uses this data or all of your team members use R, then this is a nice solution to the problem.83 At the moment, it is not possible to use write_stars()
to write to a netCDF file that supports the third dimension as time. However, this may not be the case for a long time (See the discussion here).
What drivers are available varies depending on your particular installation of GDAL version. But, this is not something you need to worry about. It is very unlikely that you are reading or writing raster data in the format that is not available.↩︎
You can use↩︎
But, you would be giving up the opportunity to use
stars_proxy
by doing so.↩︎