Computing Agroclimate Metrics in R

Part I. Intro and Simple Metrics

December 2, 2022
https://ucanr-igis.github.io/agroclimR/

Start Recording

About Me…

IGIS Team

Open-Source Resources for Decision Support Tools

About You

Type of Organization

Email domain

Location

Familiarity with Agroclimate Metrics

Familiarity with R

Today’s Outline

Part I. Intro and Computationally Simple Metrics

Intro to Agroclimate Metrics
Weather data
R tricks
- data wrangling
- ggplot 101
- working with dates
Notebook #1: Simple Metrics with CIMIS Data

Part II. Cumulative Metrics and Multi-year Summaries

Degree days
Chill portions
R tricks
- Multi-year summaries
- Grouping rows
Notebook #2

Part III. Agroclimate Metrics with Modeled Climate Data

Climate data: LOCA downscaled CIMP5 climate projections
Best Practices with modeled climate data
Climate metrics to business metrics
Stacking and chaining metrics
Bioclimatic analogues
Bioclimatic rasters

Workshop Goals

Develop general familiarity with:

agroclimate metrics
weather data
data wrangling methods in R
specialized R packages

Come away with:

some hands-on experience
working code recipes!

Workshop materials:

Slides: https://ucanr-igis.github.io/agroclimR/
RStudio project: https://posit.cloud/content/5055980
GitHub repo: https://github.com/ucanr-igis/agroclimR

Agroclimate Metrics

Agroclimate metrics reflect weather driven factors that influence the development of plants and insects.

Examples:

extreme heat days
tropical nights
heatwaves
accumulated degree days
frost days
winter chill
dates of first fall freeze & last spring freeze
diurnal temperature range
evapotranspiration

These are abiotic factors but they influence biotic factors that affect crops (like disease).

The main ingredients of metrics are measurements of weather variables (like air temperature).

We can compute them for the past, present, and future.

Why are they useful?

Weather derived metrics are generally good predictors of growth

Pistachio nut embryo length vs degree days
Zhang et al (2021)

Managing Farm Operations

Optimize Pest Management

Timing is everything in integrated pest management
Save $$ and reduce harmful side effects

Scheduling

Irrigation scheduling and amounts
When to start deficit irrigation
Put out traps
Schedule harvest crews
Plan for the coming season

Understanding the Effects of Climate Change

https://doi.org/10.1016/j.ufug.2018.07.020	Where can I go today to see the climate my city will have 50 years from now?
How many generations of Navel Orangeworm will growers have to deal with in the coming decades?	https://doi.org/10.1016/j.scitotenv.2020.142657
https://doi.org/doi:10.1371/journal.pone.0020155	How much chill can we expect in the coming decades? Will there be enough to have a economically viable farming operation?
What kind of frost exposure will we see mid-century?	https://doi.org/10.1016/j.scitotenv.2020.143971
https://doi.org/10.3390/agronomy12010205	What is the ‘new normal’ for agroclimatic metrics for specialty crops?

Why would you want to do these in R?

R is an extremely flexible computing environment,
with strengths in:

data manipulation
visualization
data modeling
integrating other data
widely taught
strong user community

Access weather and climate data

    


owmr:
OpenWeatherMap API Wrapper
 rnoaa: ‘NOAA’
Weather Data from R
 riem: Accesses
Weather Data from the Iowa Environment Mesonet
 …plus many
others

 Or write your own!

Weather Data

Station Based

Highest quality data are from sensors
mounted on weather stations
Many public and private networks
Many are online
Closer is better
On-site is best
Near ground is good

Typical variables:

air temperature
solar radiation
relative humidity
dew point
precipitation
soil temperature
soil moisture
vapor pressure
wind speed
wind direction

Interpolated / Gridded Data

based on weather station data with modeling
PRISM (continental USA, 1979 - present, 800m & 4km)
gridMET (continental USA, based on PRISM)
Livneh (continental USA, 1950–2013, 6km)

Accessing Data

File server

files may cover a large area
files may be big
you have to read the file and
extract the AOI yourself

API service

communication between computers
just get what you need
may require registration
may require a subscription fee

R-tricks: Piping

Piping syntax is an alternative way of chaining functions together than nested parentheses:

zoo(moo(boo(foo(99),k=7),n=4))

With piping, you use the pipe operator |> or %>% to ‘feed’ the result of one function as the first argument to the next function.

Keyboard shortcut for inserting the pipe operator: ctrl + shift + m
You can tell RStudio which pipe to insert under Tools >> Global Options >> Code
|> (‘native’ pipe introduced R4.0)
%>% (traditional, requires magrittr package)

R-tricks: Data wrangling

Whatever is needed to get your data frame ready for the function(s)
you need for analysis and visualization.

dropping columns
renaming columns
changing the order of columns
creating new columns with an expression
filtering rows
sorting rows

going from ‘long’ to ‘wide’ formats
joining data frames based on a common field
stacking / appending data frames
splitting tables
aggregating rows into groups

The go-to package for wrangling data frames is dplyr:

Common `dplyr` Functions

subset rows	filter(), slice()
order rows	arrange(), arrange(desc())
pick column(s)	select(), pull()
add new columns	mutate()
offset rows	lead(), lag()
vectorized conditional checks	if_else(), case_when()
join data frames	left_join(), right_join(), inner_join()

Chaining dplyr functions

Most dplyr functions take a tibble, and return a tibble.

This makes them very pipe friendly.

Example

Let’s look at the storms tibble (NOAA Atlantic hurricane database, 1975-2000, including position, wind speed, etc. every six hours):

library(dplyr)
head(storms)

## # A tibble: 6 × 13
##   name   year month   day  hour   lat  long status categ…¹  wind press…² tropi…³
##   <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr>  <ord>   <int>   <int>   <int>
## 1 Amy    1975     6    27     0  27.5 -79   tropi… -1         25    1013      NA
## 2 Amy    1975     6    27     6  28.5 -79   tropi… -1         25    1013      NA
## 3 Amy    1975     6    27    12  29.5 -79   tropi… -1         25    1013      NA
## 4 Amy    1975     6    27    18  30.5 -79   tropi… -1         25    1013      NA
## 5 Amy    1975     6    28     0  31.5 -78.8 tropi… -1         25    1012      NA
## 6 Amy    1975     6    28     6  32.4 -78.7 tropi… -1         25    1012      NA
## # … with 1 more variable: hurricane_force_diameter <int>, and abbreviated
## #   variable names ¹category, ²pressure, ³tropicalstorm_force_diameter

Which category 3 hurricane had the largest diameter of hurricane force winds?

We can answer this question in one easy-to-read expression:

storms %>% 
  select(name, year, month, day, hour, category, status, hurricane_force_diameter) |>  ## select the columns we need
  filter(category == 3) |>                                                             ## category 3 
  arrange(desc(hurricane_force_diameter)) |>                                           ## arrange descending by diameter
  slice(1:3)                                                                           ## show the top 3

## # A tibble: 3 × 8
##   name    year month   day  hour category status    hurricane_force_diameter
##   <chr>  <dbl> <dbl> <int> <dbl> <ord>    <chr>                        <int>
## 1 Ivan    2004     9    16     0 3        hurricane                      165
## 2 Ivan    2004     9    16     6 3        hurricane                      165
## 3 Nicole  2016    10    13    15 3        hurricane                      160

Note also that with dplyr functions you generally don’t have to put column names in quotes.

R-tricks: Reshaping Data

Reshaping data includes:

turning rows into columns (aka pivot tables, cross tab query)
turning columns into rows

The go-to tidyverse package for reshaping data frames is tidyr:

Common `tidyr` Functions

pivot_wider()
 

pivot_longer()
 

R-tricks: Dates

Agroclimate metrics frequently requires working with date values.

These are not Date objects:

x <- "2021-06-15"

y <- "4/1/2019"

To convert date-formatted text to Date objects use as.Date():

bloom_date <- as.Date("2021-04-15")
class(bloom_date)

## [1] "Date"

For everything else…

Finding heat spells with `rle()`

Heat spells or heat waves are often defined as number of consecutive days where the maximum temperature exceeds a certain threshold.

It is relatively easy to flag which days exceed a threshold:

grwsn_hotyn <- grwsn_dailymax_tbl |> 
  mutate(hotyn = (max_temp > 100))

We can pull out consecutive days of high temperatures with the rle() function.

`rle()`

Run length encoding is a way to compress data that is i) ordered, and ii) has repetitions. The best way to see how it works is with an illustration:

Now imagine we have time series data where TRUE means a hot day:

To compute this with rle():

hotdays_df <- data.frame(dt = seq(from = as.Date("2020-06-01"), to = as.Date("2020-06-19"), by = 1),
                      hotyn = c(T,T,F,F,F,F,F,T,T,T,T,T,F,F,F,F,T,T,T))
hotdays_df

##            dt hotyn
## 1  2020-06-01  TRUE
## 2  2020-06-02  TRUE
## 3  2020-06-03 FALSE
## 4  2020-06-04 FALSE
## 5  2020-06-05 FALSE
## 6  2020-06-06 FALSE
## 7  2020-06-07 FALSE
## 8  2020-06-08  TRUE
## 9  2020-06-09  TRUE
## 10 2020-06-10  TRUE
## 11 2020-06-11  TRUE
## 12 2020-06-12  TRUE
## 13 2020-06-13 FALSE
## 14 2020-06-14 FALSE
## 15 2020-06-15 FALSE
## 16 2020-06-16 FALSE
## 17 2020-06-17  TRUE
## 18 2020-06-18  TRUE
## 19 2020-06-19  TRUE

First, run the vector in rle():

x <- rle(hotdays_df$hotyn)
x$values

## [1]  TRUE FALSE  TRUE FALSE  TRUE

x$lengths

## [1] 2 5 5 4 3

Now we can write R expressions to pull out what we want.

## Which groups meet both conditions to be a heat wave?
group_is_heatwave <- (x$values == TRUE) & (x$lengths >= 3)
group_is_heatwave

## [1] FALSE FALSE  TRUE FALSE  TRUE

## Number of heatwaves in this period
sum(group_is_heatwave)

## [1] 2

ggplot 101

library(palmerpenguins)
head(penguins)

## # A tibble: 6 × 8
##   species island    bill_length_mm bill_depth_mm flipper_l…¹ body_…² sex    year
##   <fct>   <fct>              <dbl>         <dbl>       <int>   <int> <fct> <int>
## 1 Adelie  Torgersen           39.1          18.7         181    3750 male   2007
## 2 Adelie  Torgersen           39.5          17.4         186    3800 fema…  2007
## 3 Adelie  Torgersen           40.3          18           195    3250 fema…  2007
## 4 Adelie  Torgersen           NA            NA            NA      NA <NA>   2007
## 5 Adelie  Torgersen           36.7          19.3         193    3450 fema…  2007
## 6 Adelie  Torgersen           39.3          20.6         190    3650 male   2007
## # … with abbreviated variable names ¹flipper_length_mm, ²body_mass_g

Make a scatter plot:

ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm, color = species)) +
  geom_point() +
  ggtitle("Bill Length vs Flipper Length for 3 Species of Penguins")

## Warning: Removed 2 rows containing missing values (`geom_point()`).

Anatomy of a ggplot

R Notebooks

R Notebooks are written in “R Markdown”, which combines text and R code.

Notebook 1: Simple Metrics with CIMIS Data

Import CIMIS weather station data
Time slice by dates
Compute rolling averages
Threshold based metrics
Spells and runs
Multi-variable metrics (e.g., diurnal temperature range)
Irrigation scheduling using reference ET₀

RStudio Cloud project for this workshop:

https://posit.cloud/content/5055980

After it opens:

Or see the Completed Notebook #1.

Computing Agroclimate Metrics in R

Part I. Intro and Simple Metrics

Start Recording

About Me…

IGIS Team

Open-Source Resources for Decision Support Tools

About You

Today’s Outline

Part I. Intro and Computationally Simple Metrics

Part II. Cumulative Metrics and Multi-year Summaries

Part III. Agroclimate Metrics with Modeled Climate Data

Workshop Goals

Agroclimate Metrics

Why are they useful?

Weather derived metrics are generally good predictors of growth

Managing Farm Operations

Optimize Pest Management

Scheduling

Understanding the Effects of Climate Change

Why would you want to do these in R?

Weather Data

Station Based

Interpolated / Gridded Data

Accessing Data

R-tricks: Piping

R-tricks: Data wrangling

Common `dplyr` Functions

Chaining dplyr functions

Example

R-tricks: Reshaping Data

Common `tidyr` Functions

R-tricks: Dates

Finding heat spells with `rle()`

`rle()`

ggplot 101

Anatomy of a ggplot

R Notebooks

Notebook 1: Simple Metrics with CIMIS Data

Break!

End

Computing Agroclimate Metrics in R

Part I. Intro and Simple Metrics

Start Recording

About Me…

IGIS Team

Open-Source Resources for Decision Support Tools

About You

Today’s Outline

Part I. Intro and Computationally Simple Metrics

Part II. Cumulative Metrics and Multi-year Summaries

Part III. Agroclimate Metrics with Modeled Climate Data

Workshop Goals

Agroclimate Metrics

Why are they useful?

Weather derived metrics are generally good predictors of growth

Managing Farm Operations

Optimize Pest Management

Scheduling

Understanding the Effects of Climate Change

Why would you want to do these in R?

Weather Data

Station Based

Interpolated / Gridded Data

Accessing Data

R-tricks: Piping

R-tricks: Data wrangling

Common dplyr Functions

Chaining dplyr functions

Example

R-tricks: Reshaping Data

Common tidyr Functions

R-tricks: Dates

Finding heat spells with rle()

rle()

ggplot 101

Anatomy of a ggplot

R Notebooks

Notebook 1: Simple Metrics with CIMIS Data

Break!

End

Common `dplyr` Functions

Common `tidyr` Functions

Finding heat spells with `rle()`

`rle()`