7.4 Reading and writing raster data

There are so many formats in which raster data is stored. Some of the common ones include GeoTIFF, netCDF, GRIB. All the available GDAL drivers for reading and writing raster data can be found by the following code:81

sf::st_drivers(what = "raster") 

The output of the above function is put into a table below.

7.4.1 Reading raster data

You can use read_stars() to read a raster data file. It is very unlikely that the raster file you are trying to read is not in one of the supported formats.

For example, you can read a GeoTIFF file as follows:

(
ppt_m1_y09_stars <- read_stars("Data/PRISM_ppt_y2009_m1.tif") 
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
                        Min. 1st Qu. Median     Mean 3rd Qu.   Max.  NA's
PRISM_ppt_y2009_m1.tif     0       0  0.436 3.543222  3.4925 56.208 60401
dimension(s):
     from   to   offset      delta refsys point
x       1 1405 -125.021  0.0416667  NAD83 FALSE
y       1  621  49.9375 -0.0416667  NAD83 FALSE
band    1   31       NA         NA     NA    NA
                                                                          values
x                                                                           NULL
y                                                                           NULL
band PRISM_ppt_stable_4kmD2_20090101_bil,...,PRISM_ppt_stable_4kmD2_20090131_bil
     x/y
x    [x]
y    [y]
band    

This one imports a raw PRISM data stored as a BIL file.

(
prism_tmax_20180701 <- read_stars("Data/PRISM_tmax_stable_4kmD2_20180701_bil/PRISM_tmax_stable_4kmD2_20180701_bil.bil")
)
stars object with 2 dimensions and 1 attribute
attribute(s):
                                    Min. 1st Qu. Median     Mean 3rd Qu.   Max.
PRISM_tmax_stable_4kmD2_201807...  3.623  25.318 31.596 29.50749  33.735 48.008
                                     NA's
PRISM_tmax_stable_4kmD2_201807...  390874
dimension(s):
  from   to   offset      delta refsys point values x/y
x    1 1405 -125.021  0.0416667  NAD83    NA   NULL [x]
y    1  621  49.9375 -0.0416667  NAD83    NA   NULL [y]

You can import multiple raster data files into one stars object by simply supplying a vector of file names:

files <- c("Data/PRISM_tmax_stable_4kmD2_20180701_bil/PRISM_tmax_stable_4kmD2_20180701_bil.bil", "Data/PRISM_tmax_stable_4kmD2_20180702_bil/PRISM_tmax_stable_4kmD2_20180702_bil.bil")

(
prism_tmax_201807_two <- read_stars(files)
)
stars object with 2 dimensions and 2 attributes
attribute(s):
                                    Min. 1st Qu. Median     Mean 3rd Qu.   Max.
PRISM_tmax_stable_4kmD2_201807...  3.623  25.318 31.596 29.50749  33.735 48.008
PRISM_tmax_stable_4kmD2_201807...  0.808  26.505 30.632 29.93235  33.632 47.999
                                     NA's
PRISM_tmax_stable_4kmD2_201807...  390874
PRISM_tmax_stable_4kmD2_201807...  390874
dimension(s):
  from   to   offset      delta refsys point values x/y
x    1 1405 -125.021  0.0416667  NAD83    NA   NULL [x]
y    1  621  49.9375 -0.0416667  NAD83    NA   NULL [y]

As you can see, each file becomes an attribute.82 It is more convenient to have them as layers stacked along the third dimension. To do that you can add along = 3 option as follows:

(
prism_tmax_201807_two <- read_stars(files, along = 3)
)
stars object with 3 dimensions and 1 attribute
attribute(s), summary of first 1e+05 cells:
                                    Min. 1st Qu. Median     Mean 3rd Qu.   Max.
PRISM_tmax_stable_4kmD2_201807...  4.693 19.5925 23.653 22.05224 24.5945 33.396
                                    NA's
PRISM_tmax_stable_4kmD2_201807...  60401
dimension(s):
        from   to   offset      delta refsys point values x/y
x          1 1405 -125.021  0.0416667  NAD83    NA   NULL [x]
y          1  621  49.9375 -0.0416667  NAD83    NA   NULL [y]
new_dim    1    2       NA         NA     NA    NA   NULL    

Note that while the GeoTIFF format (and many other formats) can store multi-band (multi-layer) raster data allowing for an additional dimension beyond x and y, it does not store the values of the dimension like the date dimension we saw in prcp_tmax_PRISM_m8_y09. So, when you read a multi-layer raster data saved as a GeoTIFF, the third dimension of the resulting stars object will always be called band without any explicit time information. On the other hand, netCDF files are capable of storing the time dimension values. So, when you read a netCDF file with valid time dimension, you will have time dimension when it is read.

(
read_ncdf(system.file("nc/bcsd_obs_1999.nc", package = "stars"))  
)
stars object with 3 dimensions and 2 attributes
attribute(s):
                Min.   1st Qu.   Median      Mean   3rd Qu.      Max. NA's
pr [mm/m]  0.5900000 56.139999 81.88000 101.26433 121.07250 848.54999 7116
tas [C]   -0.4209678  8.898887 15.65763  15.48932  21.77979  29.38581 7116
dimension(s):
          from to offset delta  refsys point                    values x/y
longitude    1 81    -85 0.125  WGS 84    NA                      NULL [x]
latitude     1 33     33 0.125  WGS 84    NA                      NULL [y]
time         1 12     NA    NA POSIXct    NA 1999-01-31,...,1999-12-31    

The refsys of the time dimension is POSIXct, which is one of the date classes.

7.4.2 Writing a stars object to a file

You can write a stars object to a file using write_stars() using one of the GDAL drivers.

Let’s save prcp_tmax_PRISM_m8_y09["tmax",,,], which has date dimension whose refsys is Date.

write_stars(prcp_tmax_PRISM_m8_y09["tmax",,,], "Data/tmax_m8_y09_from_stars.tif")

Let’s read the file we just saved.

read_stars("Data/tmax_m8_y09_from_stars.tif")
stars object with 3 dimensions and 1 attribute
attribute(s):
                             Min.  1st Qu. Median     Mean  3rd Qu.   Max.
tmax_m8_y09_from_stars.tif  1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
     from to   offset      delta refsys point values x/y
x       1 20 -121.729  0.0416667  NAD83 FALSE   NULL [x]
y       1 20  46.6458 -0.0416667  NAD83 FALSE   NULL [y]
band    1 10       NA         NA     NA    NA   NULL    

Notice that the third dimension is now called band and all the date information is lost. The loss of information happened when we saved prcp_tmax_PRISM_m8_y09["tmax",,,] as a GeoTIFF file. One easy way to avoid this problem is to just save a stars object as an R dataset.

#--- save it as an rds file ---#
saveRDS(prcp_tmax_PRISM_m8_y09["tmax",,,], "Data/tmax_m8_y09_from_stars.rds")
#--- read it back ---#
readRDS("Data/tmax_m8_y09_from_stars.rds") 
stars object with 3 dimensions and 1 attribute
attribute(s):
       Min.  1st Qu. Median     Mean  3rd Qu.   Max.
tmax  1.833 17.55575 21.483 22.03543 26.54275 39.707
dimension(s):
     from to     offset      delta refsys point values x/y
x       1 20   -121.729  0.0416667  NAD83 FALSE   NULL [x]
y       1 20    46.6458 -0.0416667  NAD83 FALSE   NULL [y]
date    1 10 2009-08-11     1 days   Date    NA   NULL    

As you can see, date information is retained. So, if you are the only one who uses this data or all of your team members use R, then this is a nice solution to the problem.83 At the moment, it is not possible to use write_stars() to write to a netCDF file that supports the third dimension as time. However, this may not be the case for a long time (See the discussion here).


  1. What drivers are available varies depending on your particular installation of GDAL version. But, this is not something you need to worry about. It is very unlikely that you are reading or writing raster data in the format that is not available.↩︎

  2. You can use↩︎

  3. But, you would be giving up the opportunity to use stars_proxy by doing so.↩︎