Computing Agroclimate Metrics in R

Part I. Intro and Simple Metrics


December 2, 2022
https://ucanr-igis.github.io/agroclimR/

Start Recording



About Me…

IGIS Team

Open-Source Resources for Decision Support Tools


About You

Type of Organization

Email domain

Location

Familiarity with Agroclimate Metrics

Familiarity with R

Today’s Outline

Part I. Intro and Computationally Simple Metrics


Part II. Cumulative Metrics and Multi-year Summaries


Part III. Agroclimate Metrics with Modeled Climate Data


Workshop Goals

Develop general familiarity with:


Come away with:


Workshop materials:


Agroclimate Metrics

Agroclimate metrics reflect weather driven factors that influence the development of plants and insects.

Examples:

These are abiotic factors but they influence biotic factors that affect crops (like disease).

The main ingredients of metrics are measurements of weather variables (like air temperature).

We can compute them for the past, present, and future.

Why are they useful?

Weather derived metrics are generally good predictors of growth


Pistachio nut embryo length vs degree days
Zhang et al (2021)


Managing Farm Operations

Optimize Pest Management

  • Timing is everything in integrated pest management
  • Save $$ and reduce harmful side effects


Scheduling

  • Irrigation scheduling and amounts
  • When to start deficit irrigation
  • Put out traps
  • Schedule harvest crews
  • Plan for the coming season


Understanding the Effects of Climate Change


https://doi.org/10.1016/j.ufug.2018.07.020
Where can I go today to see the climate my city will have 50 years from now?
How many generations of Navel Orangeworm will growers have to deal with in the coming decades?
https://doi.org/10.1016/j.scitotenv.2020.142657

https://doi.org/doi:10.1371/journal.pone.0020155
How much chill can we expect in the coming decades? Will there be enough to have a economically viable farming operation?
What kind of frost exposure will we see mid-century?
https://doi.org/10.1016/j.scitotenv.2020.143971

https://doi.org/10.3390/agronomy12010205
What is the ‘new normal’ for agroclimatic metrics for specialty crops?

Why would you want to do these in R?

R is an extremely flexible computing environment,
with strengths in:

  • data manipulation
  • visualization
  • data modeling
  • integrating other data
  • widely taught
  • strong user community


Access weather and climate data



owmr: OpenWeatherMap API Wrapper
rnoaa: ‘NOAA’ Weather Data from R
riem: Accesses Weather Data from the Iowa Environment Mesonet
…plus many others

Or write your own!


Weather Data

Station Based

  • Highest quality data are from sensors
    mounted on weather stations
  • Many public and private networks
  • Many are online
  • Closer is better
  • On-site is best
  • Near ground is good

Typical variables:
  • air temperature
  • solar radiation
  • relative humidity
  • dew point
  • precipitation
  • soil temperature
  • soil moisture
  • vapor pressure
  • wind speed
  • wind direction

Interpolated / Gridded Data

  • based on weather station data with modeling
  • PRISM (continental USA, 1979 - present, 800m & 4km)
  • gridMET (continental USA, based on PRISM)
  • Livneh (continental USA, 1950–2013, 6km)

Accessing Data

File server

  • files may cover a large area
  • files may be big
  • you have to read the file and
    extract the AOI yourself

API service

  • communication between computers
  • just get what you need
  • may require registration
  • may require a subscription fee

R-tricks: Piping

Piping syntax is an alternative way of chaining functions together than nested parentheses:

zoo(moo(boo(foo(99),k=7),n=4))

With piping, you use the pipe operator |> or %>% to ‘feed’ the result of one function as the first argument to the next function.


R-tricks: Data wrangling

Whatever is needed to get your data frame ready for the function(s)
you need for analysis and visualization.

  • dropping columns
  • renaming columns
  • changing the order of columns
  • creating new columns with an expression
  • filtering rows
  • sorting rows
  • going from ‘long’ to ‘wide’ formats
  • joining data frames based on a common field
  • stacking / appending data frames
  • splitting tables
  • aggregating rows into groups

The go-to package for wrangling data frames is dplyr:


Common dplyr Functions

subset rows filter(), slice()
order rows arrange(), arrange(desc())
pick column(s) select(), pull()
add new columns mutate()
offset rows lead(), lag()
vectorized conditional checks if_else(), case_when()
join data frames left_join(), right_join(), inner_join()


Chaining dplyr functions

Most dplyr functions take a tibble, and return a tibble.

This makes them very pipe friendly.


Example

Let’s look at the storms tibble (NOAA Atlantic hurricane database, 1975-2000, including position, wind speed, etc. every six hours):

library(dplyr)
head(storms)
## # A tibble: 6 × 13
##   name   year month   day  hour   lat  long status categ…¹  wind press…² tropi…³
##   <chr> <dbl> <dbl> <int> <dbl> <dbl> <dbl> <chr>  <ord>   <int>   <int>   <int>
## 1 Amy    1975     6    27     0  27.5 -79   tropi… -1         25    1013      NA
## 2 Amy    1975     6    27     6  28.5 -79   tropi… -1         25    1013      NA
## 3 Amy    1975     6    27    12  29.5 -79   tropi… -1         25    1013      NA
## 4 Amy    1975     6    27    18  30.5 -79   tropi… -1         25    1013      NA
## 5 Amy    1975     6    28     0  31.5 -78.8 tropi… -1         25    1012      NA
## 6 Amy    1975     6    28     6  32.4 -78.7 tropi… -1         25    1012      NA
## # … with 1 more variable: hurricane_force_diameter <int>, and abbreviated
## #   variable names ¹​category, ²​pressure, ³​tropicalstorm_force_diameter


Which category 3 hurricane had the largest diameter of hurricane force winds?

We can answer this question in one easy-to-read expression:

storms %>% 
  select(name, year, month, day, hour, category, status, hurricane_force_diameter) |>  ## select the columns we need
  filter(category == 3) |>                                                             ## category 3 
  arrange(desc(hurricane_force_diameter)) |>                                           ## arrange descending by diameter
  slice(1:3)                                                                           ## show the top 3
## # A tibble: 3 × 8
##   name    year month   day  hour category status    hurricane_force_diameter
##   <chr>  <dbl> <dbl> <int> <dbl> <ord>    <chr>                        <int>
## 1 Ivan    2004     9    16     0 3        hurricane                      165
## 2 Ivan    2004     9    16     6 3        hurricane                      165
## 3 Nicole  2016    10    13    15 3        hurricane                      160

Note also that with dplyr functions you generally don’t have to put column names in quotes.


R-tricks: Reshaping Data

Reshaping data includes:

The go-to tidyverse package for reshaping data frames is tidyr:


Common tidyr Functions

pivot_wider()

pivot_longer()


R-tricks: Dates

Agroclimate metrics frequently requires working with date values.

These are not Date objects:

x <- "2021-06-15"

y <- "4/1/2019"


To convert date-formatted text to Date objects use as.Date():

bloom_date <- as.Date("2021-04-15")
class(bloom_date)
## [1] "Date"


For everything else…


Finding heat spells with rle()

Heat spells or heat waves are often defined as number of consecutive days where the maximum temperature exceeds a certain threshold.

It is relatively easy to flag which days exceed a threshold:

grwsn_hotyn <- grwsn_dailymax_tbl |> 
  mutate(hotyn = (max_temp > 100))

We can pull out consecutive days of high temperatures with the rle() function.


rle()

Run length encoding is a way to compress data that is i) ordered, and ii) has repetitions. The best way to see how it works is with an illustration:


Now imagine we have time series data where TRUE means a hot day:


To compute this with rle():

hotdays_df <- data.frame(dt = seq(from = as.Date("2020-06-01"), to = as.Date("2020-06-19"), by = 1),
                      hotyn = c(T,T,F,F,F,F,F,T,T,T,T,T,F,F,F,F,T,T,T))
hotdays_df
##            dt hotyn
## 1  2020-06-01  TRUE
## 2  2020-06-02  TRUE
## 3  2020-06-03 FALSE
## 4  2020-06-04 FALSE
## 5  2020-06-05 FALSE
## 6  2020-06-06 FALSE
## 7  2020-06-07 FALSE
## 8  2020-06-08  TRUE
## 9  2020-06-09  TRUE
## 10 2020-06-10  TRUE
## 11 2020-06-11  TRUE
## 12 2020-06-12  TRUE
## 13 2020-06-13 FALSE
## 14 2020-06-14 FALSE
## 15 2020-06-15 FALSE
## 16 2020-06-16 FALSE
## 17 2020-06-17  TRUE
## 18 2020-06-18  TRUE
## 19 2020-06-19  TRUE


First, run the vector in rle():

x <- rle(hotdays_df$hotyn)
x$values
## [1]  TRUE FALSE  TRUE FALSE  TRUE
x$lengths
## [1] 2 5 5 4 3


Now we can write R expressions to pull out what we want.

## Which groups meet both conditions to be a heat wave?
group_is_heatwave <- (x$values == TRUE) & (x$lengths >= 3)
group_is_heatwave
## [1] FALSE FALSE  TRUE FALSE  TRUE
## Number of heatwaves in this period
sum(group_is_heatwave)
## [1] 2


ggplot 101

library(palmerpenguins)
head(penguins)
## # A tibble: 6 × 8
##   species island    bill_length_mm bill_depth_mm flipper_l…¹ body_…² sex    year
##   <fct>   <fct>              <dbl>         <dbl>       <int>   <int> <fct> <int>
## 1 Adelie  Torgersen           39.1          18.7         181    3750 male   2007
## 2 Adelie  Torgersen           39.5          17.4         186    3800 fema…  2007
## 3 Adelie  Torgersen           40.3          18           195    3250 fema…  2007
## 4 Adelie  Torgersen           NA            NA            NA      NA <NA>   2007
## 5 Adelie  Torgersen           36.7          19.3         193    3450 fema…  2007
## 6 Adelie  Torgersen           39.3          20.6         190    3650 male   2007
## # … with abbreviated variable names ¹​flipper_length_mm, ²​body_mass_g


Make a scatter plot:

ggplot(penguins, aes(x = flipper_length_mm, y = bill_length_mm, color = species)) +
  geom_point() +
  ggtitle("Bill Length vs Flipper Length for 3 Species of Penguins")
## Warning: Removed 2 rows containing missing values (`geom_point()`).


Anatomy of a ggplot

R Notebooks

R Notebooks are written in “R Markdown”, which combines text and R code.

Notebook 1: Simple Metrics with CIMIS Data

  1. Import CIMIS weather station data
  2. Time slice by dates
  3. Compute rolling averages
  4. Threshold based metrics
  5. Spells and runs
  6. Multi-variable metrics (e.g., diurnal temperature range)
  7. Irrigation scheduling using reference ET0

RStudio Cloud project for this workshop:

https://posit.cloud/content/5055980

After it opens:


Or see the Completed Notebook #1.

Break!

End