Spatial Data Analysis with R
Society for Conservation GIS, July 2021

Geoprocessing

Geoprocessing

Generally speaking, geoprocessing functions are functions that analyze or alter the geometry of the data.

We can generally divide geoprocessing into 3 categories:

Geoprocessing functions do their calculations with map units.

If area or distance measurements are involved, use projected data.

Geometric Measurements

Function Computes
st_area() area of polygons
st_length() length / perimeter
st_distance() distance between features

Measurement functions from sf compute length and area in map units, but the units are saved as part of the result. This makes it easy to convert to using set_units() function from units package.

Example: Compute the Area of Yosemite

To get accurate results, we should project the boundary into UTM:

## Define a convenience variable for UTM Zone 11
epsg_utm11n_nad83 <- 26911

## Import the YNP border
yose_bnd_utm <- st_read(dsn="./data", layer="yose_boundary") %>% 
  st_transform(epsg_utm11n_nad83)
## Reading layer `yose_boundary' from data source `D:\Workshops\R-Spatial\rspatial_mod\outputs\rspatial_data\data' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 11 fields
## Geometry type: POLYGON
## Dimension:     XY
## Bounding box:  xmin: -119.8864 ymin: 37.4947 xmax: -119.1964 ymax: 38.18515
## Geodetic CRS:  North_American_Datum_1983

Next, find the area:

## Compute the area of the park
yose_area <- yose_bnd_utm %>% st_area()
yose_area
## 3019918537 [m^2]

Large values of meters-squared can be hard to interpret. Let’s convert the area hectares and square miles using set_units() from the units package.

library(units)

## View in hectacres
set_units(yose_area, ha)

## View in square miles
set_units(yose_area, mi^2)
## 301991.9 [ha]
## 1165.997 [mi^2]

Import the burned area layer from historical fires, and compute the area of each fires (see code below to import them). Which was the biggest fire?

firehist_gdb <- "./data/yose_firehistory.gdb"
file.exists(firehist_gdb)

## View the layers in this source
st_layers(firehist_gdb)

## Import the 'Fire History Polys' layer  (case sensitive!)
yose_firepolys_utm <- st_read(firehist_gdb, layer="YNP_FireHistoryPolys")

[Hint] [Solution]

You can use which.max() to find the largest element of a vector.

## Compute the area of each polygon
fire_areas <- st_area(yose_firepolys_utm)

## Find the index of the largest value
i <- which.max(fire_areas)

## Take a look at the row
yose_firepolys_utm[i,]

## Computet the size in miles squared
units::set_units(fire_areas[i], mi^2)
## Simple feature collection with 1 feature and 21 fields
## Geometry type: MULTIPOLYGON
## Dimension:     XY
## Bounding box:  xmin: 246268.6 ymin: 4182145 xmax: 264049.5 ymax: 4215598
## Projected CRS: NAD83 / UTM zone 11N
##          FIRE_ID LINK_ID SACS_ID FILE_ID NAME    ACRES YEAR TYPE CAUSE HECTARES AREA PERIMETER SEQ_NO DECADE  STARTDATE    OUTDATE X_COORD Y_COORD
## 4251 CA-YNP-0126    <NA>    <NA>    <NA>  Rim 78892.24 2013   WF    HC       NA   NA   1123267     NA   2010 2013-08-17 2013-10-18      NA      NA
##      ET_ID Shape_Length Shape_Area                          Shape
## 4251    NA     209505.2  318835601 MULTIPOLYGON (((247876.4 42...
## 123.1031 [mi^2]

⇒ the 2013 Rim fire was the largest!

Compute Lengths

Computing lengths and perimeters is done similarly with st_length().

Example: How many miles of primary roads are there?

First import the roads and inspect the attribute table.

## Import the roads
gdb_fn <- "./data/yose_roads.gdb"
yose_roads <- st_read(gdb_fn, "Yosemite_Roads")
glimpse(yose_roads)
## Reading layer `Yosemite_Roads' from data source `D:\Workshops\R-Spatial\rspatial_mod\outputs\rspatial_data\data\yose_roads.gdb' using driver `OpenFileGDB'
## Simple feature collection with 823 features and 40 fields
## Geometry type: MULTILINESTRING
## Dimension:     XY
## Bounding box:  xmin: 234658.1 ymin: 4139484 xmax: 324852.6 ymax: 4250252
## Projected CRS: NAD83 / UTM zone 11N
## Rows: 823
## Columns: 41
## $ RDNAME         <chr> "Northside Drive", "Northside Drive", "Northside Drive", "Southside Drive", "Northside Drive", "Northside Drive", "Mirror ~
## $ RDALTNAME      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ MAPLABEL       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ RDSTATUS       <chr> "Existing", "Existing", "Existing", "Existing", "Existing", "Existing", "Existing", "Existing", "Existing", "Existing", "E~
## $ RDCLASS        <chr> "Primary", "Primary", "Primary", "Primary", "Primary", "Primary", "Secondary", "Primary", "Secondary", "Primary", "Primary~
## $ RDSURFACE      <chr> "Asphalt", "Asphalt", "Asphalt", "Asphalt", "Asphalt", "Asphalt", "Asphalt", "Asphalt", "Asphalt", "Asphalt", "Asphalt", "~
## $ RDONEWAY       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ RDLANES        <int> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ RDHICLEAR      <chr> "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "N~
## $ RTENUMBER      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ SEASONAL       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ SEASDESC       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ RDMAINTAINER   <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ ISEXTANT       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ PUBLICDISPLAY  <chr> "Public Map Display", "Public Map Display", "Public Map Display", "Public Map Display", "Public Map Display", "Public Map ~
## $ DATAACCESS     <chr> "Unrestricted", "Unrestricted", "Unrestricted", "Unrestricted", "Unrestricted", "Unrestricted", "Unrestricted", "Unrestric~
## $ UNITCODE       <chr> "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "YOSE", "Y~
## $ UNITNAME       <chr> "Yosemite National Park", "Yosemite National Park", "Yosemite National Park", "Yosemite National Park", "Yosemite National~
## $ GROUPCODE      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ GROUPNAME      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ REGIONCODE     <chr> "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PWR", "PW~
## $ CREATEDATE     <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ CREATEUSER     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ EDITDATE       <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ EDITUSER       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ MAPMETHOD      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ MAPSOURCE      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ SOURCEDATE     <dttm> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ XYACCURACY     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ ROUTEID        <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ FACLOCID       <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ FACASSETID     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ FEATUREID      <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ GEOMETRYID     <chr> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA~
## $ NOTES          <chr> "In Park = Y, Fire Road = No", "In Park = Y, Fire Road = No", "In Park = Y, Fire Road = No", "In Park = Y, Fire Road = No"~
## $ YOSE_Surface   <chr> "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Paved", "Pave~
## $ YOSE_FIRE_ROAD <chr> "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "No", "N~
## $ YOSE_INPARK    <chr> "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Yes", "Ye~
## $ YOSE_Type      <chr> "Primary Road", "Primary Road", "Primary Road", "Primary Road", "Primary Road", "Primary Road", "Secondary Road", "Primary~
## $ Shape_Length   <dbl> 493.53512, 67.90781, 33.35863, 258.00073, 117.94169, 167.51558, 71.82903, 330.03084, 73.37367, 80.21527, 173.71187, 114.03~
## $ Shape          <MULTILINESTRING [m]> MULTILINESTRING ((270865.2 ..., MULTILINESTRING ((270821.5 ..., MULTILINESTRING ((270797.2 ..., MULTILINES~

Next, write an expression that returns the total length of all primary roads.

## Load the units package so we can convert meters to miles
library(units)

## Use a mix of functions from dplyr, sf, base R, and units 
yose_roads %>% 
  dplyr::filter(RDCLASS == "Primary") %>%   ## grab just the primary roads
  sf::st_length() %>%                       ## compute length of each segment (as numeric vector)
  sum() %>%                                 ## take the sum
  units::set_units("miles")                 ## convert to miles
## 375.5348 [miles]

How many miles of Secondary roads are there?

[Solution]

yose_roads %>% 
  dplyr::filter(RDCLASS == "Secondary") %>% 
  st_length() %>% sum() %>% set_units("miles")
## 191.8279 [miles]

To add geometric measurements to the attribute table, use dplyr::mutate():

## Add road length to the attribute table
yose_roads <- yose_roads %>% 
  dplyr::mutate(road_length = st_length(yose_roads))

Calculating distances between features and finding nearest neighbors will be covered in the Spatial Queries session.

R Notebook Exercise 1

nb_geoproc_01_scgis21.Rmd
Geoprocessing measurement functions

preview notebook | answer key

Creating New Geometries

Manipulating Geometries

sf has a number of functions that will create new geometries based on existing ones:

Function Computes
st_buffer() compute a buffer around this geometry/each geometry
st_centroid() gives centroid of geometry
st_convex_hull() creates convex hull of set of points
st_line_merge() merges lines
st_segmentize() adds points to straight lines
st_voronoi() creates voronoi tesselation
st_triangulate() triangulates set of points (not constrained)
st_polygonize() creates polygon from lines that form a closed ring
st_simplify() simplifies lines by removing vertices
st_split() split a polygon given line geometry
st_boundary() return the boundary of a geometry

Example: Find centroids of watersheds

First, import the watersheds:

## Import watersheds
gpkg_watershd_fn <- "./data/yose_watersheds.gpkg"
file.exists(gpkg_watershd_fn)
yose_watersheds <- st_read(gpkg_watershd_fn, layer="calw221")

## Plot it
plot(st_geometry(yose_watersheds), axes=TRUE)

Next, compute the centroids:

yose_watersheds_ctr <- yose_watersheds %>% st_centroid()

## Plot it
tm_shape(yose_watersheds) + tm_borders() +
  tm_shape(yose_watersheds_ctr) + tm_symbols(size=0.3, col="red")

Example: Buffer the Campgrounds

Let’s put a 1km buffer around each of the campgrounds.

First import the campgrounds (which are part of the yose_poi Shapefile):

## Import and project the campgrounds
yose_campgrnds_utm <- st_read(dsn="./data", layer="yose_poi") %>% 
  dplyr::filter(POITYPE == 'Campground') %>% 
  dplyr::select(POINAME) %>% 
  st_transform(epsg_utm11n_nad83)

Next, buffer them (remember the buffer distance should be in map units):

## Create the 1km buffer
yose_campgrnds_buff <- st_buffer(yose_campgrnds_utm, dist=1000)

## Plot
tm_shape(yose_bnd_utm) +
  tm_borders() +
tm_shape(yose_campgrnds_buff) +
  tm_polygons(col="red") +
tm_layout(title="1km buffer around campgrounds", 
          title.bg.color="white") +
tm_scale_bar(position = c("right", "bottom"))

To create an interior buffer, make the buffer distance negative.

If you want to buffer using a distance in some other unit, you can pass a units object to the dist argument:

## Create a buffer in miles
d <- units::as_units(0.5, "mi")
halfmile_buff <- st_buffer(yose_campgrnds_utm, dist = d)

Logical operators that generate new geometry

Other functions take two input layers and combine them in different ways.

Function Computes
st_union() union of several geometries
st_intersection() intersection of pairs of geometries
st_difference() difference between pairs of geometries
st_sym_difference() symmetric difference (xor, union w/o intersection)

Example: Union

Union all the burn scars for 1980s. What was the total area burned?

First we pull out all the fires for the 1980s:

## Extract fire scars from the 1980s
yose_fires_80s <- yose_firepolys_utm %>% filter(YEAR >= 1980 & YEAR <=1989) 
nrow(yose_fires_80s)

## Plot them
plot(yose_bnd_utm %>% st_geometry(), col=NA, border="red")
plot(yose_fires_80s %>% st_geometry(), col=NA, border="gray50", add=TRUE)
## [1] 767

Next we union the polygons and compute the area.

## Union the polygons (to remove overlapping polygons)
yose_fires_80s_union <- yose_fires_80s %>% st_union()
plot(yose_bnd_utm %>% st_geometry(), col=NA, border="red")
plot(yose_fires_80s_union %>% st_geometry(), col="gray80", border="gray60", add=TRUE)

## Compute the total area, displying the units in square miles
yose_fires_80s_union %>% st_area() %>% set_units(mi2)
## 87.66586 [mi2]

Find the total area burned during the 1990s.

[Solution]

yose_firepolys_utm %>% 
  filter(YEAR >= 1990 & YEAR <=1999) %>% 
  st_union() %>% 
  st_area() %>% 
  set_units(mi2)
## 194.4498 [mi2]

Sometimes you may get a typology error caused by a self-intersecting polygon or something similar. You can repair some types of typology errors with st_make_valid() from the lwgeom package.

Run the following code and you will probably get an error message because one of the fire polygons for the 2000s has a self-intersecton. Un-comment the the st_make_valid() line and try again.

## Find the total area burned during the 2000s
yose_firepolys_utm %>% 
  filter(YEAR >= 2000 & YEAR <=2009) %>% 
  # lwgeom::st_make_valid() %>% 
  st_union() %>% 
  st_area() %>% 
  set_units(mi2)

Summary

Today we saw how to:

  • making geometric measurements such as length and area
  • create new geometries based on existing ones, such as centroids and buffers
  • use logical operators like union and intersection that combine two or more inputs

Additional Resources

sf Vignette 3 - Manipulating Simple Feature Geometries. sf package Vignette