About this R Notebook

R Notebooks are a ‘flavor’ of R markdown that combine plain text and R commands in code chunks. (You can download the Rmd file from the ‘code’ button at the top of the page.) You run code chunks in the document line-by-line, and the output appears immediately below the code chunk.

If you’re in RStudio, you can minimize the console window (and probably close the right-hand panes as well). You won’t need it, because when you run R commands in a R Notebook the output appears below the code chunk (not the console). This takes some getting used to if you’re used to working at the console.

Keyboard shortcuts you can use within a R Notebook:

  • run the current line of R: ctrl + enter
  • run everything in the current code chunk: ctrl + shift + enter
  • insert a new code chunk: ctrl + alt + i

Chill Portions: Background

‘Chill hours’ and ‘Chill portions’ are commonly used to estimate when tree crops will “wake up” from their winter slumber and start to grow their fruit / nuts. The timing and rate of chill also affects fruit productivity and scheduling of management actions (including harvest). Many tree crops (e.g., nuts) require a certain number of hours at chilly temperatures in order for the tree to blossom and the fruit to ripen. This presents a concern for climate change, because if the winters aren’t cold enough the operation may no longer be economically viable.

Computing ‘chill hours’ essentially involves adding up the total amount of time during the cold season within a temperature range (e.g., 32 - 45 °F). ‘Chill portions’ is a similar calculation, however instead of simply adding up the number of hours within a certain temperature range, additional weights are assigned based on bands of temperature and/or warming periods periods between cold spells (this mimics tree physiology thus producing better predictions).

Computing chill hours and chill portions requires hourly temperature data. You may be lucky enough to have hourly temperature recordings from weather stations for historic and real-time analyses. However for projected climate scenarios, the smallest unit of time for the CMIP5 family of climate models is daily min and max. This creates a conundrum if you want to explore how cumulative chill hours may change due to climate change.

To deal with this, we can estimate hourly temperatures based on the daily minimum and maximum temperature. The model that does this uses the time of sunrise and sunset to model temperature variation over the day. The chillR package, developed by Dr. Eike Luedeling, has functions for modeling hourly temperatures and computing chill portions (see the chillR Vignette for details).

See also:

Luedeling, E., M. Zhang and E. H. Girvetz (2009). Climatic Changes Lead to Declining Winter Chill for Fruit and Nut Trees in California during 1950–2099. PLOS ONE 4(7): e6166.

Exercise Summary

In this example, we:

  1. Download from Cal-Adapt downscaled daily projected minimum and maximum temperatures for a single point on the west side of the San Joaquin Valley from September 2080 thru June 2090 (10 growing seasons), for two emissions scenarios and 10 GCMs (global climate modesl).

  2. Use chillR to compute a) modeled hourly temperatures, and b) accumulated chill portions.

  3. Plot the results

Note: This notebook is primarily intended to demonstrate how to write code to compute chill portions using projected climate data. It is not presented as best practice for assessing the viability of tree crops under climate change!

Want to skip the details? You can ⇊ cut to the chase ⇊.

Setup

The following chunk will install any packages you don’t already have that will be needed below:

pkgs_req <- c("remotes", "ggplot2", "dplyr", "tmap", "conflicted", "tidyr", "lubridate", "tibble", "chillR", "stringr", "magrittr")
The working directory was changed to d:/github/cal-adapt/caladaptr inside a notebook chunk. The working directory will be reset when the chunk is finished running. Use the knitr root.dir option in the setup chunk to change the working directory for notebook chunks.
pkgs_missing <- pkgs_req[!(pkgs_req %in% installed.packages()[,"Package"])]
if (length(pkgs_missing)) install.packages(pkgs_missing, dependencies=TRUE)

Install caladaptr from GitHub (if needed):

## Uncomment and run the following line if you don't have the latest version of caladaptr
## devtools::install_github("ucanr-igis/caladaptr")

Load all the libraries we’ll be using below:

library(caladaptr)
library(units)
library(ggplot2)
library(dplyr)
library(conflicted)
library(tidyr)
library(lubridate)
library(tibble)
library(chillR)
library(magrittr)
library(stringr)
library(sf)

The last setup task is to define your preferences when you’re forced to use an ambiguous function name (i.e., a function that exists in more than one package). This is particularly needed with a few common generic functions from dplyr:

conflict_prefer("filter", "dplyr", quiet = TRUE)
conflict_prefer("count", "dplyr", quiet = TRUE)
conflict_prefer("select", "dplyr", quiet = TRUE)

Getting Started: Chill Portions for a Single Season

1) Create a Cal-Adapt API Request

The first step in getting climate data with caladaptR is to create an API request object. This involves stringing together a series of functions together that specify the pieces of the request.

To illustrate how to write code for this analysis, we’ll create an API request to query a single point, single GCM, and a single scenario for a single season. This of course is not the recommended practice, and you should always look at a minimum 10-20 year time spans and take averages.

## Pick a location
pt1_xy <- c(-119.964, 36.019)

## Create the API Request object
pt1_cap <- ca_loc_pt(coords = pt1_xy) %>%
  ca_gcm(gcms[1]) %>%
  ca_scenario("rcp45") %>%
  ca_period("day") %>%
  ca_dates(start = "2080-09-01", end = "2081-06-30") %>%
  ca_cvar(c("tasmax", "tasmin"))

pt1_cap
Cal-Adapt API Request
Location(s): 
  x: -119.964
  y: 36.019
Variable(s): tasmax, tasmin
Temporal aggregration period(s): day
GCM(s): HadGEM2-ES
Scenario(s): rcp45
Dates: 2080-09-01 to 2081-06-30
 

To verify the location saved in an API request, you can plot it. (Note we still haven’t fetched any data yet, this just shows you the location the request will ask for.)

plot(pt1_cap)

2) Fetch Data from Cal-Adapt

Next we fetch the temperature data with ca_getvals_tbl(), which returns a tibble. We then use mutate() to create a new column with the temperature values in Celsius.

pt1_tbl <- pt1_cap %>% 
  ca_getvals_tbl(quiet = TRUE) %>% 
  mutate(temp_c = set_units(val, degC)) %>% 
  select(id, cvar, period, gcm, scenario, dt, temp_c)

head(pt1_tbl)

3) Wrangle Results into the Format Required by chillR

Guided by chillR’s documentation and Vignette on hourly temperature records, we prepare a five-column tibble with the projected temperature values:

library(tidyr); library(lubridate)

pt1_dyr_tbl <- pt1_tbl %>%
  mutate(DATE = as.POSIXct(format(dt), tz="America/Los_Angeles")) %>%
  mutate(Year = as.integer(year(DATE)), 
         Month = as.integer(month(DATE)), 
         Day = day(DATE),
         temp_c = as.numeric(temp_c)) %>%
  select(cvar, temp_c, Year, Month, Day) %>%
  pivot_wider(names_from = cvar, values_from = temp_c) %>%
  rename(Tmax = tasmax, Tmin = tasmin)

head(pt1_dyr_tbl)

4) Model Hourly Temperature

Now we’re ready to compute the hourly temps. In addition to the daily min/max values, we need to pass the latitude of our site (which is used to determine sunrise and sunset time):

library(chillR)

pt1_hrtmp_wide <- make_hourly_temps(latitude = pt1_xy[2],
                                year_file = pt1_dyr_tbl,
                                keep_sunrise_sunset = FALSE)
head(pt1_hrtmp_wide)

5) Plot Modelled Hourly Temps

To plot the hourly temperatures, we need to convert the ‘wide’ format of the table to a ‘long’ format. Fortunately tidyr has everything we need to pivot between long and wide formats:

pt1_hrtmp_long <- pt1_hrtmp_wide %>%
  pivot_longer(cols = starts_with("Hour_"),
               names_to = "Hour",
               names_prefix = "Hour_",
               names_transform = list(hour = as.integer),
               values_to = "temp_c") %>%
  mutate(date_hour = ISOdatetime(Year, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles")) %>%
  select(date_hour, temp_c) %>%
  arrange(date_hour)

head(pt1_hrtmp_long)

Now we can plot the hourly temperatures. To illustrate we’ll plot just the month of December, 2079:

ts_start <- as.Date("2080-12-01")
ts_end <- as.Date("2080-12-31")

pt1_hrtmp_onewk <- pt1_hrtmp_long %>% 
  filter(date_hour >= ts_start, date_hour <= ts_end)

ggplot(data = pt1_hrtmp_onewk,
       aes(x = date_hour, y = temp_c)) +
  geom_line(aes(color="red"), show.legend = FALSE) +
  labs(title = "Modelled Hourly Temperature for Pt 1: RCP 4.5", 
     subtitle = "December 1-31, 2080",
     caption = "GCM: HadGEM2-ES",
     x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

6) Compute Chill Portions

The Dynamic_Model() function from chillR computes cumulative chill portions. We can add these values to the table as a new column with mutate():

pt1_hrtmpchill_long <- pt1_hrtmp_long %>% 
  mutate(accum_chill_prtn = Dynamic_Model(pt1_hrtmp_long$temp_c))

head(pt1_hrtmpchill_long)

Lastly, we plot the accumulated chill portions with ggplot():

ggplot(data = pt1_hrtmpchill_long,
       aes(x = date_hour, y = accum_chill_prtn)) +
  geom_line(aes(color="red"), show.legend = FALSE) +
  labs(title = "Projected Chill Portion Accumulation for Pt 1: RCP 4.5", 
       subtitle = "Sept 2080 - June 2081",
       caption = "GCM: HadGEM2-ES",
       x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

7) Identify when Accumulated Chill Reaches a Threshhold

Suppose you’re hoping to grow the Kerman variety of Pistachios in 2080, which require [54 accumulated chill hours] (http://fruitsandnuts.ucdavis.edu/Weather_Services/chilling_accumulation_models/CropChillReq/) to ripen. We can easily find the date when the required accumulated chill is predicted to be reached:

chill_min <- 54

chill_date <- pt1_hrtmpchill_long %>% 
  filter(accum_chill_prtn > chill_min) %>% 
  slice(1) %>% 
  pull(date_hour) %>% 
  as.Date()

chill_date
[1] "2081-02-08"

Compare Chill Accumulation for Multiple Growing Seasons, Climate Models, and Emissions Scenarios

Climate models are not crystal balls, so thinking about the consequences of climate change generally involves looking at the predictions from a) a number of climate models (GCMs), 2) multiple emissions scenarios, and 3) multiple years.

Here we’ll compute the date when the chill portion reaches 36 for the 10 GCMs recommended for California, and 10 growing seasons. That should give us 100 maturation dates, which will plot using a box plot. We’ll do this twice, once with RCP 4.5 (low emissions scenario) and RCP 8.5 (higher emission scenario).

1) Create the API Request

## Pick a location
pt1_xy <- c(-119.964, 36.019)

pt1_10yrs_cap <- ca_loc_pt(coords = pt1_xy) %>%
  ca_gcm(gcms[1:10]) %>%
  ca_scenario(c("rcp45", "rcp85")) %>%
  ca_period("day") %>%
  ca_dates(start = "2080-09-01", end = "2090-06-30") %>%
  ca_cvar(c("tasmax", "tasmin"))

pt1_10yrs_cap
Cal-Adapt API Request
Location(s): 
  x: -119.964
  y: 36.019
Variable(s): tasmax, tasmin
Temporal aggregration period(s): day
GCM(s): HadGEM2-ES, CNRM-CM5, CanESM2, MIROC5, ACCESS1-0, CCSM4, CESM1-BGC, CMCC-CMS, GFDL-CM3, HadGEM2-CC
Scenario(s): rcp45, rcp85
Dates: 2080-09-01 to 2090-06-30
 

2) Fetch Data

Our API request asks for >140,000 temperature values, so fetching the data could take a minute or two.

pt1_10yrs_tbl <- pt1_10yrs_cap %>% ca_getvals_tbl(quiet = TRUE) 

dim(pt1_10yrs_tbl)
[1] 143600      8
head(pt1_10yrs_tbl)

3) Munge the Results

This tibble has daily min/max values from September 1 2079 thru June 30, 2090. We need to break it up into groups by 1) growing season, 2) GCM, and 3) RCP. Grouping it up by GCM and RCP will be easy - there are already columns for those. We need to add a column for growing season however, which (for the purposes of this example) we define as September thru June. In other words, every day from Nov 1 2079 thru June 30 2080 should be part of growing season 2080, Nov 2080 thru June 2081 should be part of growing season 2081, and so on.

To construct the growing season column, we’ll first create columns for Year, Month, and Day. Later on, we’ll use these columns in a formula to compute the growing season year.

pt1_10yrs_ymd_tbl <- pt1_10yrs_tbl %>% 
  mutate(Year = as.integer(substr(dt, 1, 4)), 
         Month = as.integer(substr(dt, 6, 7)),
         Day = as.integer(substr(dt, 9, 10))) %>% 
  select(cvar, gcm, scenario, Year, Month, Day, val)

head(pt1_10yrs_ymd_tbl)

We can now add a column for growing season using a case_when statement. While we’re at it, we’ll remove rows that don’t belong to any growing season, and convert the temperature from Kelvin to Celcius.

pt1_10yrs_ymd_gs_tbl <- pt1_10yrs_ymd_tbl %>% 
  mutate(gs = case_when(Month <= 6 ~ Year,
                        Month >= 9 ~ as.integer(Year + 1))) %>% 
  filter(!is.na(gs)) %>% 
  mutate(temp_c = set_units(val, degC))

head(pt1_10yrs_ymd_gs_tbl)

Next, we convert this ‘long’ format into a ‘wide’ format which chillR::make_hourly_temps() requires:

library(tidyr); library(lubridate)

pt1_10yrs_ymd_gcmrcp_gs_tbl <- pt1_10yrs_ymd_gs_tbl %>%
  select(cvar, temp_c, Year, Month, Day, gcm, scenario, gs) %>%
  pivot_wider(names_from = cvar, values_from = temp_c) %>%
  rename(Tmax = tasmax, Tmin = tasmin)

dim(pt1_10yrs_ymd_gcmrcp_gs_tbl)
[1] 60640     8
head(pt1_10yrs_ymd_gcmrcp_gs_tbl)

4) Model Hourly Temperature

Now we’re ready to compute the hourly temps. In addition to the daily min/max values, we need to pass the latitude of our site (which is used to determine sunrise and sunset time):

pt1_10yrs_hrtmp_wide <- make_hourly_temps(latitude = pt1_xy[2],
                                year_file = pt1_10yrs_ymd_gcmrcp_gs_tbl,
                                keep_sunrise_sunset = FALSE)
dim(pt1_10yrs_hrtmp_wide)
[1] 60640    33
head(pt1_10yrs_hrtmp_wide)

Now we go from wide back to long, and recreate the date-time column. This generates over a million rows so it can take several seconds.

pt1_10yrs_hrtmp_long <- pt1_10yrs_hrtmp_wide %>%
  pivot_longer(cols = starts_with("Hour_"),
               names_to = "Hour",
               names_prefix = "Hour_",
               names_transform = list(Hour = as.integer),
               values_to = "temp_c") %>%
  mutate(date_hour = ISOdatetime(Year, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles")) %>%
  arrange(date_hour)

dim(pt1_10yrs_hrtmp_long)
[1] 1455360      12
head(pt1_10yrs_hrtmp_long)

To reality-check, we can plot a portion of the hourly temperature:

ts_start <- as.Date("2081-01-01")
ts_end <- as.Date("2081-03-01")

sample_time_series <- pt1_10yrs_hrtmp_long %>% 
  filter(date_hour >= ts_start, date_hour <= ts_end, 
         gcm == "CanESM2", scenario == "rcp45") %>% 
  arrange(date_hour)

dim(sample_time_series)
[1] 1417   12
ggplot(data = sample_time_series,
       aes(x = date_hour, y = as.numeric(temp_c))) +
  geom_line(aes(color="red"), show.legend = FALSE) +
  labs(title = "Modelled Hourly Temperature", x = "date", y = "temp (C)") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

5) Compute Chill Portions

Now we’re ready to compute cumulative chill portions. We need to break this up by growing season, gcm and rcp.

pt1_10yrs_hrtmpcp <- pt1_10yrs_hrtmp_long %>% 
  mutate(temp_c = as.numeric(temp_c)) %>% 
  group_by(gs, gcm, scenario) %>% 
  mutate(accum_chill_prtn = Dynamic_Model(temp_c))

head(pt1_10yrs_hrtmpcp)

View the maximum accumulated chill portion per RCP:

pt1_10yrs_maxchill <- pt1_10yrs_hrtmpcp %>% 
  group_by(scenario, gs, gcm) %>% 
  summarise(max_chill = max(accum_chill_prtn), .groups = 'drop') 

pt1_10yrs_maxchill %>% head()

6) Compute Safe Chill

For each RCP, we compute the ‘safe winter chill’ value. ‘Safe winter chill’ is the minimum chill portion one can expect 90% of the time (Luedeling et al, 2009). For growers, safe winter chill reflects the economic need for an orchard operation to produce good yields in most years, rather than an average year. Here it is computed as the 10% quantile of all end-of-season chill portions in the analysis (ie., all GCMs and all growing seasons 2080-2090), with the assumption that all of these outcomes are equally likely.

## Compute safe winter chill for each RCP
safe_chill_tbl <- pt1_10yrs_maxchill %>% 
  group_by(scenario) %>% 
  summarize(safe_chill = quantile(max_chill, probs = 0.1), .groups = "drop")

safe_chill_tbl

Plot distribution of chill portion:

## Plot the histogram and overlay the safe chill value
projected_histo <- ggplot(pt1_10yrs_maxchill, aes(x=max_chill)) + 
  geom_histogram() +
  geom_vline(data = safe_chill_tbl, color = "red", aes(xintercept = safe_chill), size = 1) +
  facet_wrap("scenario", nrow = 1) +
  labs(title = "Maximum Projected Chill Portion", 
     subtitle = "2080 - 2090",
     caption = paste0(
       str_wrap(paste0("GCMs: ", paste(unique(pt1_10yrs_maxchill$gcm), collapse = ", "), width = 105)),
       "\n",
       str_wrap("The red line is the 10% quantile and represents the safe winter chill (the mimimum chill portion one may expect 90% of the time assumming all projections are equally likely).", width = 105)),
     x = "end-of-season chill portion", y = "count") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

projected_histo

7) How Many Growing Seasons Reach the Minimum Chill Requirement?

Next, for each RCP we compute how many projected temperature series resulted in at least 54 accumulated chill portions needed for a viable crop of the Kerman cultivar of Pistachios:

chill_min <- 54

pt1_10yrs_prct_reached_min <- pt1_10yrs_hrtmpcp %>% 
  group_by(gs, gcm, scenario) %>% 
  summarise(reached_thresh = max(accum_chill_prtn) >= chill_min) %>% 
  ungroup() %>% 
  group_by(scenario) %>% 
  summarise(percent_reached_thresh = sum(reached_thresh) / n()) %>% 
  mutate(time_period = "2081 - 2090", percent_reached_thresh = scales::percent(percent_reached_thresh)) %>% 
  relocate(time_period, scenario, percent_reached_thresh)

pt1_10yrs_prct_reached_min

8) Accumulated Chill Curves

Next, we plot the curves for accumulated chill for RCP 4.5 and 8.5. We have 10 growing seasons and 10 GCMs, for a total of 100 curves. The lines that end above the minimum chill requirement represent simulations where the minimum chill for a viable crop was reached.

## First make a copy of data, manually set the year to the same value (so the dates are in the same range), and keep
## only midnight values

pt1_10yrs_hrtmpcp_4plot <- pt1_10yrs_hrtmpcp %>% 
  filter(Hour == 0) %>% 
  mutate(YearPlot = if_else(Month >= 9, 1970, 1971)) %>% 
  mutate(date_hour = ISOdatetime(YearPlot, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles"),
         gs_gcm = paste(gs, gcm, sep = "_")) %>% 
  select(scenario, gs, gcm, gs_gcm, date_hour, accum_chill_prtn)

## Create plot
ggplot(data = pt1_10yrs_hrtmpcp_4plot %>% filter(scenario == "rcp45"),
       aes(x = date_hour, y = accum_chill_prtn)) +
  geom_line(aes(color=gs_gcm), show.legend = FALSE) +
  geom_hline(yintercept = chill_min, size = 1) +
  labs(title = "Projected Chill Portion Accumulation for Pt 1: RCP 4.5", 
       subtitle = "2080 - 2090",
       caption = str_wrap(paste0("GCMs: ", paste(unique(pt1_10yrs_hrtmpcp_4plot$gcm), collapse = ", "), width = 60)),
       x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

## RCP 85
ggplot(data = pt1_10yrs_hrtmpcp_4plot %>% filter(scenario == "rcp85"),
       aes(x = date_hour, y = accum_chill_prtn)) +
  geom_line(aes(color=gs_gcm), show.legend = FALSE) +
  geom_hline(yintercept = chill_min, size = 1) +
  labs(title = "Projected Chill Portion Accumulation for Pt 1: RCP 8.5", 
       subtitle = "2080 - 2090",
       caption = stringr::str_wrap(paste0("GCMs: ", paste(unique(pt1_10yrs_hrtmpcp_4plot$gcm), collapse = ", "), width = 60)),
       x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

9) Distribution of Required Accumulation Date

Here we plot the distribution of projected dates when the accumulated chill reaches the minimum requirement.

## Number of runs for each emissions scenario = number of growing seasons * number of gcms
num_runs <- length(unique(pt1_10yrs_hrtmpcp$gs)) * length(unique(pt1_10yrs_hrtmpcp$gcm)) 

## For each run, find the first date when the minimum chill portion was reached. 
pt1_10yrs_scen_chilldt <- pt1_10yrs_hrtmpcp %>% 
  filter(Hour == 0) %>% 
  filter(accum_chill_prtn >= chill_min) %>% 
  group_by(gs, gcm) %>% 
  slice_min(order_by = date_hour, n = 1) %>% 
  ungroup() %>%
  mutate(date_1970 = ISOdate(year = 1970, month = Month, day = Day)) %>% 
  select(scenario, date_1970)

## we're getting one NA - investigate
## pt1_10yrs_scen_chilldt %>% filter(is.na(date_1970))  

## Make a data frame for the total number of dates and middle date per scenario 
## (This is for the labels on the box plot)
scen_stats_tbl <- pt1_10yrs_scen_chilldt %>% 
  group_by(scenario) %>% 
  summarize(num_dates = n(), mid_dt = median(date_1970, na.rm = TRUE))
`summarise()` ungrouping output (override with `.groups` argument)
## Make a list object for easy reference. NOT NEEDED ANYMORE
## num_runs_lst <- scen_stats_tbl %$% split(num_dates, scenario) 

## Plot the results
ggplot(pt1_10yrs_scen_chilldt, aes(date_1970 , scenario)) + 
  geom_boxplot() +
  geom_label(data = scen_stats_tbl, 
            aes(label=paste0("n=", num_dates), x = mid_dt + (3.5 * 24 * 3600)), 
            size = 3.5, fill = "white", label.size = 0) +
  labs(title = paste0("Date Accumlated Chill Portion Reaches ", chill_min ), 
       subtitle = "2080 - 2090",
       caption = str_wrap(paste("GCMs: ", paste(unique(pt1_10yrs_hrtmpcp_4plot$gcm), collapse = ", ")), width = 80),
       x = "date", y = "Emissions Scenario") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

Historical Baseline

When looking at historical data, we can either use modeled or observed. Here we use modeled because 1) modeled data has been shown to matches observed patterns very well, and 2) it facilitates comparisons with modeled predictions.

## Pick a location
pt1_xy <- c(-119.964, 36.019)

pt1_baseline_cap <- ca_loc_pt(coords = pt1_xy) %>%
  ca_gcm(gcms[1:10]) %>%
  ca_scenario(c("historical")) %>%
  ca_period("day") %>%
  ca_dates(start = "1995-01-01", end = "2005-12-31") %>%
  ca_cvar(c("tasmax", "tasmin"))

pt1_baseline_cap
Cal-Adapt API Request
Location(s): 
  x: -119.964
  y: 36.019
Variable(s): tasmax, tasmin
Temporal aggregration period(s): day
GCM(s): HadGEM2-ES, CNRM-CM5, CanESM2, MIROC5, ACCESS1-0, CCSM4, CESM1-BGC, CMCC-CMS, GFDL-CM3, HadGEM2-CC
Scenario(s): historical
Dates: 1995-01-01 to 2005-12-31
 
pt1_baseline_tbl <- pt1_baseline_cap %>% ca_getvals_tbl(quiet = TRUE) 
dim(pt1_baseline_tbl)
[1] 80360     8
head(pt1_baseline_tbl)

Add columns:

pt1_baseline_ymd_tbl <- pt1_baseline_tbl %>% 
  mutate(Year = as.integer(substr(dt, 1, 4)), 
         Month = as.integer(substr(dt, 6, 7)),
         Day = as.integer(substr(dt, 9, 10))) %>% 
  select(cvar, gcm, scenario, Year, Month, Day, val)
pt1_baseline_ymd_tbl

Add growing season column:

pt1_baseline_ymd_gs_tbl <- pt1_baseline_ymd_tbl %>% 
  mutate(gs = case_when(Month <= 6 ~ Year,
                        Month >= 9 ~ as.integer(Year + 1))) %>% 
  filter(!is.na(gs)) %>% 
  mutate(temp_c = set_units(val, degC))
head(pt1_baseline_ymd_gs_tbl)

Pivot wider (make tasmax and tasmin separate columns):

pt1_baseline_ymd_gcmrcp_gs_tbl <- pt1_baseline_ymd_gs_tbl %>%
  select(cvar, temp_c, Year, Month, Day, gcm, scenario, gs) %>%
  pivot_wider(names_from = cvar, values_from = temp_c) %>%
  rename(Tmax = tasmax, Tmin = tasmin)

dim(pt1_baseline_ymd_gcmrcp_gs_tbl)
[1] 33360     8
head(pt1_baseline_ymd_gcmrcp_gs_tbl)
NA

Compute hourly temps (in wide format):

pt1_baseline_hrtmp_wide <- make_hourly_temps(latitude = pt1_xy[2],
                                year_file = pt1_baseline_ymd_gcmrcp_gs_tbl,
                                keep_sunrise_sunset = FALSE)
dim(pt1_baseline_hrtmp_wide)
[1] 33360    33
head(pt1_baseline_hrtmp_wide)

Convert wide to long:

pt1_baseline_hrtmp_long <- pt1_baseline_hrtmp_wide %>%
  pivot_longer(cols = starts_with("Hour_"),
               names_to = "Hour",
               names_prefix = "Hour_",
               names_transform = list(Hour = as.integer),
               values_to = "temp_c") %>%
  mutate(date_hour = ISOdatetime(Year, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles")) %>%
  arrange(date_hour)

dim(pt1_baseline_hrtmp_long)
[1] 800640     12
head(pt1_baseline_hrtmp_long)

Compute baseline chill portions:

pt1_baseline_hrtmpcp <- pt1_baseline_hrtmp_long %>% 
  mutate(temp_c = as.numeric(temp_c)) %>% 
  group_by(gs, gcm, scenario) %>% 
  mutate(accum_chill_prtn = Dynamic_Model(temp_c))

head(pt1_baseline_hrtmpcp)
NA

Compute the max-chill reached each growing season / GCM during the baseline period. Next we compute the ‘safe chill’ (10% quantile):

pt1_baseline_maxchill <- pt1_baseline_hrtmpcp %>% 
  group_by(scenario, gs, gcm) %>% 
  summarise(max_chill = max(accum_chill_prtn), .groups = 'drop') 

dim(pt1_baseline_maxchill)
[1] 120   4
pt1_baseline_maxchill %>% head()

safe_chill_baseline_tbl <- pt1_baseline_maxchill %>% 
  group_by(scenario) %>% 
  summarize(safe_chill = quantile(max_chill, probs = 0.1), .groups = "drop") 

safe_chill_baseline_tbl

Plot distribution of the baseline maximum chill portion:

## Plot the histogram and overlay the safe chill value
baseline_histo <- ggplot(pt1_baseline_maxchill, aes(x=max_chill)) + 
  geom_histogram() +
  geom_vline(data = safe_chill_baseline_tbl, color = "red", aes(xintercept = safe_chill), size = 1) +
  facet_wrap("scenario", nrow = 1) +
  labs(title = "Maximum Projected Chill Portion", 
     subtitle = "1995 - 2005",
     caption = paste0(
       str_wrap(paste0("GCMs: ", paste(unique(pt1_baseline_maxchill$gcm), collapse = ", "), width = 105)),
       "\n",
       str_wrap("The red line is the 10% quantile and represents the safe winter chill (the mimimum chill portion one may expect 90% of the time assumming all projections are equally likely).", width = 105)),
     x = "end-of-season chill portion", y = "count") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

baseline_histo

How Many Growing Seasons and Simulated (Backcast) Climate Conditions During the Baseline Seasons Reached Our Minimum Chill Requirement?

chill_min <- 54

pt1_baseline_prct_reached_min <- pt1_baseline_hrtmpcp %>% 
  group_by(gs, gcm, scenario) %>% 
  summarise(reached_thresh = max(accum_chill_prtn) >= chill_min) %>% 
  ungroup() %>% 
  group_by(scenario) %>% 
  summarise(percent_reached_thresh = sum(reached_thresh) / n()) %>% 
  mutate(time_period = "1995-2005", percent_reached_thresh = scales::percent(percent_reached_thresh)) %>% 
  relocate(time_period, scenario, percent_reached_thresh)

pt1_baseline_prct_reached_min
NA

Conclusions

In this analysis we examined projected and historic accumulated chill portions for this point:

plot(pt1_10yrs_cap)

We looked at the projected temperatures at this location according to 10 GCMs in the recent historic time period (1995-2005), as well as projected for 2080-2090 under two carbon emissions scenarios (RCP4.5 & 8.5). Our analysis shows that the percent of seasons & models where the maximum chill portion by the end of winter is greater than or equal to the amount of chill needed to grow Kerman variety of Pistachios (54) was:

pt1_baseline_prct_reached_min %>% 
  bind_rows(pt1_10yrs_prct_reached_min)

We also computed the ‘safe-chill’ portion for each of the time periods. This is the amount of chill one can expect to reach 90% of the time:

safe_chill_baseline_tbl %>% bind_rows(safe_chill_tbl)

baseline_histo

projected_histo

Interpretation

This example was primarily an exercise on how to write code, and should be taken with a grain of salt. However the results we came up with suggest that this location may continue to be viable for the Kerman variety of Pistachios in 2080-2090 under emission scenario 4.5, but may not produce enough winter chill portions consistently enough to be economically viable under RCP 8.5.

This analysis was based on the assumption that all 10 GCMs recommended for California under the 4th Climate Change Assessment are equally likely. We also interpolated hourly temperatures based on a generic statistical model that was not specific to this location. We could improve this analysis by:

  1. Selecting GCMs that are best suited for California’s San Joaquin Valley
  2. Reviewing the latest literature to assess whether emission scenario 4.5 or 8.5 seems more likely
  3. Waiting for CMIP6 projected climate data which will include sub-daily temperature values (e.g., every 2 hours)

Questions and Next Steps

Is this a good way to combine / average results from multiple models, multiple years?

Should we use actual or modelled data for the historical baseline?

Would it be useful to also compute chill hours? Other agroclimatic metrics?

Do you always start computing chill portions on Sept 1? Should this be a variable?

Add caveats about not looking at a single model, single-year.

Turn this into a Shiny app with selectable start date and required number of chill portions.

---
title: "Modeling Chill Accumulation Under Climate Change with Cal-Adapt Data"
output:
  html_notebook:
    df_print: paged
    toc: yes
    toc_float: yes
---

```{css echo = FALSE}
h1 {
  font-weight: bold;
  font-size: 24px;
  color: darkolivegreen;
  border-top: 3px solid dimgrey;
  margin-top: 1em;
  padding-top: 0.5em;
}
h1.title {
  color: black;
  border: none;
}
h2 {
  font-weight: bold;
  font-size: 22px;
  color: dimgray;
}

h3 {
  font-weight: bold;
  font-size: 18px;
  color: black;
}

```

<p><img src="https://ucanr-igis.github.io/caladaptr/reference/figures/caladaptr-beta_logo.svg" width="240" /></p>

# About this R Notebook

R Notebooks are a 'flavor' of R markdown that combine plain text and R commands in code chunks. (You can download the Rmd file from the 'code' button at the top of the page.) You run code chunks in the document line-by-line, and the output appears immediately below the code chunk.

If you're in RStudio, you can *minimize the console window* (and probably close the right-hand panes as well). You won't need it,  because when you run R commands in a R Notebook the *output appears below the code chunk* (not the console). This takes some getting used to if you're used to working at the console. 

Keyboard shortcuts you can use within a R Notebook:  

- run the current line of R: *ctrl + enter*  
- run everything in the current code chunk: *ctrl + shift + enter*  
- insert a new code chunk: *ctrl + alt + i*  

# Chill Portions: Background

['Chill hours'](https://en.wikipedia.org/wiki/Chilling_requirement) and 'Chill portions' are commonly used to estimate when tree crops will "wake up" from their winter slumber and start to grow their fruit / nuts. The timing and rate of chill also affects fruit productivity and scheduling of management actions (including harvest). Many tree crops (e.g., nuts) require a certain number of hours at chilly temperatures in order for the tree to blossom and the fruit to ripen. This presents a concern for climate change, because if the winters aren't cold enough the operation may no longer be economically viable. 

Computing 'chill hours' essentially involves adding up the total amount of time during the cold season within a  temperature range (e.g., 32 - 45 &#176;F). 'Chill portions' is a similar calculation, however instead of simply adding up the number of hours within a certain temperature range, additional weights are assigned based on bands of temperature and/or warming periods periods between cold spells (this mimics tree physiology thus producing better predictions).

Computing chill hours and chill portions requires hourly temperature data. You may be lucky enough to have hourly temperature recordings from weather stations for historic and real-time analyses. However for projected climate scenarios, the smallest unit of time for the CMIP5 family of climate models is daily min and max. This creates a conundrum if you want to explore how cumulative chill hours may change due to climate change.

To deal with this, we can estimate hourly temperatures based on the daily minimum and maximum temperature. The model that does this uses the time of sunrise and sunset to model temperature variation over the day. The [**chillR**](https://cran.r-project.org/package=chillR) package, developed by [Dr. Eike Luedeling](http://eikeluedeling.com/), has functions for modeling hourly temperatures and computing chill portions (see the chillR [Vignette](https://cran.r-project.org/web/packages/chillR/vignettes/hourly_temperatures.html) for details).

See also:

Luedeling, E., M. Zhang and E. H. Girvetz (2009). *Climatic Changes Lead to Declining Winter Chill for Fruit and Nut Trees in California during 1950–2099*. [PLOS ONE 4(7): e6166](https://doi.org/10.1371/journal.pone.0006166).

## Exercise Summary

In this example, we:

1) Download from Cal-Adapt downscaled **daily projected minimum and maximum temperatures** for a **single point** on the west side of the San Joaquin Valley from September 2080 thru June 2090 (**10 growing seasons**), for **two emissions scenarios** and **10 GCMs** (global climate modesl).

2) Use chillR to compute a) modeled hourly temperatures, and b) accumulated chill portions.

3) Plot the results

**Note:** This notebook is primarily intended to demonstrate *how to write code* to compute chill portions using projected climate data. It is *not* presented as *best practice* for assessing the viability of tree crops under climate change!

Want to skip the details? You can [**&ddarr; cut to the chase &ddarr;**](#conclusions).

# Setup

The following chunk will install any packages you don't already have that will be needed below:

```{r load_pkgs, cache = TRUE}
pkgs_req <- c("remotes", "ggplot2", "dplyr", "tmap", "conflicted", "tidyr", "lubridate", "tibble", "chillR", "stringr", "magrittr")
pkgs_missing <- pkgs_req[!(pkgs_req %in% installed.packages()[,"Package"])]
if (length(pkgs_missing)) install.packages(pkgs_missing, dependencies=TRUE)
```

Install `caladaptr` from GitHub (if needed):

```{r}
## Uncomment and run the following line if you don't have the latest version of caladaptr
## devtools::install_github("ucanr-igis/caladaptr")
```

Load all the libraries we'll be using below:

```{r library_all, message = FALSE}
library(caladaptr)
library(units)
library(ggplot2)
library(dplyr)
library(conflicted)
library(tidyr)
library(lubridate)
library(tibble)
library(chillR)
library(magrittr)
library(stringr)
library(sf)
```

The last setup task is to define your preferences when you're forced to use an ambiguous function name (i.e., a function that exists in more than one package). This is particularly needed with a few common generic functions from `dplyr`:

```{r set_conflicts}
conflict_prefer("filter", "dplyr", quiet = TRUE)
conflict_prefer("count", "dplyr", quiet = TRUE)
conflict_prefer("select", "dplyr", quiet = TRUE)
```


# Getting Started: Chill Portions for a Single Season

## 1) Create a Cal-Adapt API Request

The first step in getting climate data with caladaptR is to create an API request object. This involves stringing together a series of functions together that specify the pieces of the request. 

To illustrate how to write code for this analysis, we'll create an API request to query a single point, single GCM, and a single scenario for a single season. This of course is not the recommended practice, and you should always look at a minimum 10-20 year time spans and take averages. 

```{r}
## Pick a location
pt1_xy <- c(-119.964, 36.019)

## Create the API Request object
pt1_cap <- ca_loc_pt(coords = pt1_xy) %>%
  ca_gcm(gcms[1]) %>%
  ca_scenario("rcp45") %>%
  ca_period("day") %>%
  ca_dates(start = "2080-09-01", end = "2081-06-30") %>%
  ca_cvar(c("tasmax", "tasmin"))

pt1_cap
```
To verify the location saved in an API request, you can plot it. (Note we still haven't fetched any data yet, this just shows you the location the request will ask for.)

```{r plot_cap1}
plot(pt1_cap)
```

## 2) Fetch Data from Cal-Adapt

Next we fetch the temperature data with `ca_getvals_tbl()`, which returns a tibble. We then use `mutate()` to create a new column with the temperature values in Celsius.

```{r cap1_lst, cache = TRUE}
pt1_tbl <- pt1_cap %>% 
  ca_getvals_tbl(quiet = TRUE) %>% 
  mutate(temp_c = set_units(val, degC)) %>% 
  select(id, cvar, period, gcm, scenario, dt, temp_c)

head(pt1_tbl)
```

## 3) Wrangle Results into the Format Required by `chillR`

Guided by chillR's documentation and Vignette on [hourly temperature records](https://cran.r-project.org/web/packages/chillR/vignettes/hourly_temperatures.html), we prepare a five-column tibble with the projected temperature values:

```{r}
library(tidyr); library(lubridate)

pt1_dyr_tbl <- pt1_tbl %>%
  mutate(DATE = as.POSIXct(format(dt), tz="America/Los_Angeles")) %>%
  mutate(Year = as.integer(year(DATE)), 
         Month = as.integer(month(DATE)), 
         Day = day(DATE),
         temp_c = as.numeric(temp_c)) %>%
  select(cvar, temp_c, Year, Month, Day) %>%
  pivot_wider(names_from = cvar, values_from = temp_c) %>%
  rename(Tmax = tasmax, Tmin = tasmin)

head(pt1_dyr_tbl)
```

## 4) Model Hourly Temperature

Now we're ready to compute the hourly temps. In addition to the daily min/max values, we need to pass the latitude of our site (which is used to determine sunrise and sunset time):

```{r compute_hourly, cache = TRUE, message = FALSE}
library(chillR)

pt1_hrtmp_wide <- make_hourly_temps(latitude = pt1_xy[2],
                                year_file = pt1_dyr_tbl,
                                keep_sunrise_sunset = FALSE)
head(pt1_hrtmp_wide)
```

## 5) Plot Modelled Hourly Temps

To plot the hourly temperatures, we need to convert the 'wide' format of the table to a 'long' format. Fortunately `tidyr` has everything we need to [pivot between long and wide formats](https://tidyr.tidyverse.org/articles/pivot.html){target="_blank" rel="noopener"}: 

```{r}
pt1_hrtmp_long <- pt1_hrtmp_wide %>%
  pivot_longer(cols = starts_with("Hour_"),
               names_to = "Hour",
               names_prefix = "Hour_",
               names_transform = list(hour = as.integer),
               values_to = "temp_c") %>%
  mutate(date_hour = ISOdatetime(Year, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles")) %>%
  select(date_hour, temp_c) %>%
  arrange(date_hour)

head(pt1_hrtmp_long)
```

Now we can plot the hourly temperatures. To illustrate we'll plot just the month of December, 2079:

```{r plot_hourly_temp, cache = TRUE}
ts_start <- as.Date("2080-12-01")
ts_end <- as.Date("2080-12-31")

pt1_hrtmp_onewk <- pt1_hrtmp_long %>% 
  filter(date_hour >= ts_start, date_hour <= ts_end)

ggplot(data = pt1_hrtmp_onewk,
       aes(x = date_hour, y = temp_c)) +
  geom_line(aes(color="red"), show.legend = FALSE) +
  labs(title = "Modelled Hourly Temperature for Pt 1: RCP 4.5", 
     subtitle = "December 1-31, 2080",
     caption = "GCM: HadGEM2-ES",
     x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

```

## 6) Compute Chill Portions

The `Dynamic_Model()` function from `chillR` computes cumulative chill portions. We can add these values to the table as a new column with `mutate()`:  

```{r}
pt1_hrtmpchill_long <- pt1_hrtmp_long %>% 
  mutate(accum_chill_prtn = Dynamic_Model(pt1_hrtmp_long$temp_c))

head(pt1_hrtmpchill_long)
```

Lastly, we plot the accumulated chill portions with `ggplot()`:

```{r}
ggplot(data = pt1_hrtmpchill_long,
       aes(x = date_hour, y = accum_chill_prtn)) +
  geom_line(aes(color="red"), show.legend = FALSE) +
  labs(title = "Projected Chill Portion Accumulation for Pt 1: RCP 4.5", 
       subtitle = "Sept 2080 - June 2081",
       caption = "GCM: HadGEM2-ES",
       x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

```

## 7) Identify when Accumulated Chill Reaches a Threshhold

Suppose you're hoping to grow the Kerman variety of Pistachios in 2080, which require [**54 accumulated chill hours**]
(http://fruitsandnuts.ucdavis.edu/Weather_Services/chilling_accumulation_models/CropChillReq/) to ripen. We can easily find the date when the required accumulated chill is predicted to be reached:


```{r}
chill_min <- 54

chill_date <- pt1_hrtmpchill_long %>% 
  filter(accum_chill_prtn > chill_min) %>% 
  slice(1) %>% 
  pull(date_hour) %>% 
  as.Date()

chill_date
```

# Compare Chill Accumulation for Multiple Growing Seasons, Climate Models, and Emissions Scenarios

Climate models are not crystal balls, so thinking about the consequences of climate change generally involves looking at the predictions from a) a number of climate models (GCMs), 2) multiple emissions scenarios, and 3) multiple years.

Here we'll compute the date when the chill portion reaches 36 for the **10 GCMs** recommended for California, and **10 growing seasons**. That should give us 100 maturation dates, which will plot using a box plot. We'll do this twice, once with RCP 4.5 (low emissions scenario) and RCP 8.5 (higher emission scenario).

## 1) Create the API Request

```{r cap2_make, cache=TRUE}
## Pick a location
pt1_xy <- c(-119.964, 36.019)

pt1_10yrs_cap <- ca_loc_pt(coords = pt1_xy) %>%
  ca_gcm(gcms[1:10]) %>%
  ca_scenario(c("rcp45", "rcp85")) %>%
  ca_period("day") %>%
  ca_dates(start = "2080-09-01", end = "2090-06-30") %>%
  ca_cvar(c("tasmax", "tasmin"))

pt1_10yrs_cap
```

## 2) Fetch Data

Our API request asks for >140,000 temperature values, so fetching the data could take a minute or two.

```{r pt1_10yrs_tbl_fetch, cache = TRUE}
pt1_10yrs_tbl <- pt1_10yrs_cap %>% ca_getvals_tbl(quiet = TRUE) 

dim(pt1_10yrs_tbl)
head(pt1_10yrs_tbl)
```

## 3) Munge the Results

This tibble has daily min/max values from September 1 2079 thru June 30, 2090. We need to break it up into groups by 1) growing season, 2) GCM, and 3) RCP. Grouping it up by GCM and RCP will be easy - there are already columns for those. We need to add a column for growing season however, which (for the purposes of this example) we define as September thru June. In other words, every day from Nov 1 2079 thru June 30 2080 should be part of growing season 2080, Nov 2080 thru June 2081 should be part of growing season 2081, and so on. 

To construct the growing season column, we'll first create columns for Year, Month, and Day. Later on, we'll use these columns in a formula to compute the growing season year.

```{r}
pt1_10yrs_ymd_tbl <- pt1_10yrs_tbl %>% 
  mutate(Year = as.integer(substr(dt, 1, 4)), 
         Month = as.integer(substr(dt, 6, 7)),
         Day = as.integer(substr(dt, 9, 10))) %>% 
  select(cvar, gcm, scenario, Year, Month, Day, val)

head(pt1_10yrs_ymd_tbl)
```

We can now add a column for growing season using a case_when statement. While we're at it, we'll remove rows that don't belong to any growing season, and convert the temperature from Kelvin to Celcius.

```{r pt1_10yrs_grwsn_tbl_make, cache = TRUE}
pt1_10yrs_ymd_gs_tbl <- pt1_10yrs_ymd_tbl %>% 
  mutate(gs = case_when(Month <= 6 ~ Year,
                        Month >= 9 ~ as.integer(Year + 1))) %>% 
  filter(!is.na(gs)) %>% 
  mutate(temp_c = set_units(val, degC))

head(pt1_10yrs_ymd_gs_tbl)
```

Next, we convert this 'long' format into a 'wide' format which `chillR::make_hourly_temps()` requires:

```{r}
library(tidyr); library(lubridate)

pt1_10yrs_ymd_gcmrcp_gs_tbl <- pt1_10yrs_ymd_gs_tbl %>%
  select(cvar, temp_c, Year, Month, Day, gcm, scenario, gs) %>%
  pivot_wider(names_from = cvar, values_from = temp_c) %>%
  rename(Tmax = tasmax, Tmin = tasmin)

dim(pt1_10yrs_ymd_gcmrcp_gs_tbl)
head(pt1_10yrs_ymd_gcmrcp_gs_tbl)
```

## 4) Model Hourly Temperature

Now we're ready to compute the hourly temps. In addition to the daily min/max values, we need to pass the latitude of our site (which is used to determine sunrise and sunset time):

```{r compute_10yrs_hourly, cache = TRUE, message = FALSE}
pt1_10yrs_hrtmp_wide <- make_hourly_temps(latitude = pt1_xy[2],
                                year_file = pt1_10yrs_ymd_gcmrcp_gs_tbl,
                                keep_sunrise_sunset = FALSE)
dim(pt1_10yrs_hrtmp_wide)
head(pt1_10yrs_hrtmp_wide)
```

Now we go from wide back to long, and recreate the date-time column. This generates over a million rows so it can take several seconds.

```{r pt1_10yrs_hrtmp_long, cache = TRUE}
pt1_10yrs_hrtmp_long <- pt1_10yrs_hrtmp_wide %>%
  pivot_longer(cols = starts_with("Hour_"),
               names_to = "Hour",
               names_prefix = "Hour_",
               names_transform = list(Hour = as.integer),
               values_to = "temp_c") %>%
  mutate(date_hour = ISOdatetime(Year, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles")) %>%
  arrange(date_hour)

dim(pt1_10yrs_hrtmp_long)
head(pt1_10yrs_hrtmp_long)
```

To reality-check, we can plot a portion of the hourly temperature:

```{r plot_hourly_temp2, cache = TRUE}
ts_start <- as.Date("2081-01-01")
ts_end <- as.Date("2081-03-01")

sample_time_series <- pt1_10yrs_hrtmp_long %>% 
  filter(date_hour >= ts_start, date_hour <= ts_end, 
         gcm == "CanESM2", scenario == "rcp45") %>% 
  arrange(date_hour)

dim(sample_time_series)

ggplot(data = sample_time_series,
       aes(x = date_hour, y = as.numeric(temp_c))) +
  geom_line(aes(color="red"), show.legend = FALSE) +
  labs(title = "Modelled Hourly Temperature", x = "date", y = "temp (C)") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

```


## 5) Compute Chill Portions

Now we're ready to compute cumulative chill portions. We need to break this up by growing season, gcm and rcp.

```{r accum_chill_prtn_compute, cache = TRUE}
pt1_10yrs_hrtmpcp <- pt1_10yrs_hrtmp_long %>% 
  mutate(temp_c = as.numeric(temp_c)) %>% 
  group_by(gs, gcm, scenario) %>% 
  mutate(accum_chill_prtn = Dynamic_Model(temp_c))

head(pt1_10yrs_hrtmpcp)
```

View the maximum accumulated chill portion per RCP:

```{r}
pt1_10yrs_maxchill <- pt1_10yrs_hrtmpcp %>% 
  group_by(scenario, gs, gcm) %>% 
  summarise(max_chill = max(accum_chill_prtn), .groups = 'drop') 

pt1_10yrs_maxchill %>% head()
```

## 6) Compute Safe Chill

For each RCP, we compute the 'safe winter chill' value. 'Safe winter chill' is the minimum chill portion one can expect 90% of the time ([Luedeling et al, 2009](https://doi.org/10.1371/journal.pone.0006166)). For growers, safe winter chill reflects the economic need for an orchard operation to produce good yields in most years, rather than an average year. Here it is computed as the 10% quantile of all end-of-season chill portions in the analysis (ie., all GCMs and all growing seasons 2080-2090), with the assumption that all of these outcomes are equally likely.

```{r find_safe_chill, cache=TRUE}
## Compute safe winter chill for each RCP
safe_chill_tbl <- pt1_10yrs_maxchill %>% 
  group_by(scenario) %>% 
  summarize(safe_chill = quantile(max_chill, probs = 0.1), .groups = "drop")

safe_chill_tbl
```

Plot distribution of chill portion:

```{r chill_histo, cache = TRUE}
## Plot the histogram and overlay the safe chill value
projected_histo <- ggplot(pt1_10yrs_maxchill, aes(x=max_chill)) + 
  geom_histogram() +
  geom_vline(data = safe_chill_tbl, color = "red", aes(xintercept = safe_chill), size = 1) +
  facet_wrap("scenario", nrow = 1) +
  labs(title = "Maximum Projected Chill Portion", 
     subtitle = "2080 - 2090",
     caption = paste0(
       str_wrap(paste0("GCMs: ", paste(unique(pt1_10yrs_maxchill$gcm), collapse = ", "), width = 105)),
       "\n",
       str_wrap("The red line is the 10% quantile and represents the safe winter chill (the mimimum chill portion one may expect 90% of the time assumming all projections are equally likely).", width = 105)),
     x = "end-of-season chill portion", y = "count") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

projected_histo
```

## 7) How Many Growing Seasons Reach the Minimum Chill Requirement?

Next, for each RCP we compute how many projected temperature series resulted in at least `r chill_min` accumulated chill portions needed for a viable crop of the [Kerman cultivar of Pistachios](http://fruitsandnuts.ucdavis.edu/Weather_Services/chilling_accumulation_models/CropChillReq/):

```{r prct_min_chill_reached, cache=TRUE, message=FALSE}
chill_min <- 54

pt1_10yrs_prct_reached_min <- pt1_10yrs_hrtmpcp %>% 
  group_by(gs, gcm, scenario) %>% 
  summarise(reached_thresh = max(accum_chill_prtn) >= chill_min) %>% 
  ungroup() %>% 
  group_by(scenario) %>% 
  summarise(percent_reached_thresh = sum(reached_thresh) / n()) %>% 
  mutate(time_period = "2081 - 2090", percent_reached_thresh = scales::percent(percent_reached_thresh)) %>% 
  relocate(time_period, scenario, percent_reached_thresh)

pt1_10yrs_prct_reached_min
```


## 8) Accumulated Chill Curves

Next, we plot the curves for accumulated chill for RCP 4.5 and 8.5. We have 10 growing seasons and 10 GCMs, for a total of 100 curves. The lines that end above the minimum chill requirement represent simulations where the minimum chill for a viable crop was reached. 

```{r plot_acc_curves_rcp45, cache = TRUE}
## First make a copy of data, manually set the year to the same value (so the dates are in the same range), and keep
## only midnight values

pt1_10yrs_hrtmpcp_4plot <- pt1_10yrs_hrtmpcp %>% 
  filter(Hour == 0) %>% 
  mutate(YearPlot = if_else(Month >= 9, 1970, 1971)) %>% 
  mutate(date_hour = ISOdatetime(YearPlot, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles"),
         gs_gcm = paste(gs, gcm, sep = "_")) %>% 
  select(scenario, gs, gcm, gs_gcm, date_hour, accum_chill_prtn)

## Create plot
ggplot(data = pt1_10yrs_hrtmpcp_4plot %>% filter(scenario == "rcp45"),
       aes(x = date_hour, y = accum_chill_prtn)) +
  geom_line(aes(color=gs_gcm), show.legend = FALSE) +
  geom_hline(yintercept = chill_min, size = 1) +
  labs(title = "Projected Chill Portion Accumulation for Pt 1: RCP 4.5", 
       subtitle = "2080 - 2090",
       caption = str_wrap(paste0("GCMs: ", paste(unique(pt1_10yrs_hrtmpcp_4plot$gcm), collapse = ", "), width = 60)),
       x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))
```



```{r plot_acc_curves_rcp85, cache = TRUE}
## RCP 85
ggplot(data = pt1_10yrs_hrtmpcp_4plot %>% filter(scenario == "rcp85"),
       aes(x = date_hour, y = accum_chill_prtn)) +
  geom_line(aes(color=gs_gcm), show.legend = FALSE) +
  geom_hline(yintercept = chill_min, size = 1) +
  labs(title = "Projected Chill Portion Accumulation for Pt 1: RCP 8.5", 
       subtitle = "2080 - 2090",
       caption = stringr::str_wrap(paste0("GCMs: ", paste(unique(pt1_10yrs_hrtmpcp_4plot$gcm), collapse = ", "), width = 60)),
       x = "date", y = "Chill Portion") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))
```

## 9) Distribution of Required Accumulation Date

Here we plot the distribution of projected dates when the accumulated chill reaches the minimum requirement.

```{r}
## Number of runs for each emissions scenario = number of growing seasons * number of gcms
num_runs <- length(unique(pt1_10yrs_hrtmpcp$gs)) * length(unique(pt1_10yrs_hrtmpcp$gcm)) 

## For each run, find the first date when the minimum chill portion was reached. 
pt1_10yrs_scen_chilldt <- pt1_10yrs_hrtmpcp %>% 
  filter(Hour == 0) %>% 
  filter(accum_chill_prtn >= chill_min) %>% 
  group_by(gs, gcm) %>% 
  slice_min(order_by = date_hour, n = 1) %>% 
  ungroup() %>%
  mutate(date_1970 = ISOdate(year = 1970, month = Month, day = Day)) %>% 
  select(scenario, date_1970)

## we're getting one NA - investigate
## pt1_10yrs_scen_chilldt %>% filter(is.na(date_1970))  

## Make a data frame for the total number of dates and middle date per scenario 
## (This is for the labels on the box plot)
scen_stats_tbl <- pt1_10yrs_scen_chilldt %>% 
  group_by(scenario) %>% 
  summarize(num_dates = n(), mid_dt = median(date_1970, na.rm = TRUE))

## Make a list object for easy reference. NOT NEEDED ANYMORE
## num_runs_lst <- scen_stats_tbl %$% split(num_dates, scenario) 

## Plot the results
ggplot(pt1_10yrs_scen_chilldt, aes(date_1970 , scenario)) + 
  geom_boxplot() +
  geom_label(data = scen_stats_tbl, 
            aes(label=paste0("n=", num_dates), x = mid_dt + (3.5 * 24 * 3600)), 
            size = 3.5, fill = "white", label.size = 0) +
  labs(title = paste0("Date Accumlated Chill Portion Reaches ", chill_min ), 
       subtitle = "2080 - 2090",
       caption = str_wrap(paste("GCMs: ", paste(unique(pt1_10yrs_hrtmpcp_4plot$gcm), collapse = ", ")), width = 80),
       x = "date", y = "Emissions Scenario") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))
```


# Historical Baseline

When looking at historical data, we can either use modeled or observed. Here we use modeled because 1) modeled data has been shown to matches observed patterns very well, and 2) it facilitates comparisons with modeled predictions.

```{r cap2baseline_make, cache=TRUE}
## Pick a location
pt1_xy <- c(-119.964, 36.019)

pt1_baseline_cap <- ca_loc_pt(coords = pt1_xy) %>%
  ca_gcm(gcms[1:10]) %>%
  ca_scenario(c("historical")) %>%
  ca_period("day") %>%
  ca_dates(start = "1995-01-01", end = "2005-12-31") %>%
  ca_cvar(c("tasmax", "tasmin"))

pt1_baseline_cap
```

```{r pt1_baseline_tbl_fetch, cache = TRUE}
pt1_baseline_tbl <- pt1_baseline_cap %>% ca_getvals_tbl(quiet = TRUE) 
dim(pt1_baseline_tbl)
head(pt1_baseline_tbl)
```

Add columns:

```{r}
pt1_baseline_ymd_tbl <- pt1_baseline_tbl %>% 
  mutate(Year = as.integer(substr(dt, 1, 4)), 
         Month = as.integer(substr(dt, 6, 7)),
         Day = as.integer(substr(dt, 9, 10))) %>% 
  select(cvar, gcm, scenario, Year, Month, Day, val)
pt1_baseline_ymd_tbl
```

Add growing season column:

```{r baseline_gs, cache = TRUE}
pt1_baseline_ymd_gs_tbl <- pt1_baseline_ymd_tbl %>% 
  mutate(gs = case_when(Month <= 6 ~ Year,
                        Month >= 9 ~ as.integer(Year + 1))) %>% 
  filter(!is.na(gs)) %>% 
  mutate(temp_c = set_units(val, degC))
head(pt1_baseline_ymd_gs_tbl)
```

Pivot wider (make tasmax and tasmin separate columns):

```{r pt1_baseline_ymd_gcmrcp_gs_tbl, cache = TRUE}
pt1_baseline_ymd_gcmrcp_gs_tbl <- pt1_baseline_ymd_gs_tbl %>%
  select(cvar, temp_c, Year, Month, Day, gcm, scenario, gs) %>%
  pivot_wider(names_from = cvar, values_from = temp_c) %>%
  rename(Tmax = tasmax, Tmin = tasmin)

dim(pt1_baseline_ymd_gcmrcp_gs_tbl)
head(pt1_baseline_ymd_gcmrcp_gs_tbl)

```
Compute hourly temps (in wide format):

```{r pt1_baseline_hrtmp_wide, cache = TRUE}
pt1_baseline_hrtmp_wide <- make_hourly_temps(latitude = pt1_xy[2],
                                year_file = pt1_baseline_ymd_gcmrcp_gs_tbl,
                                keep_sunrise_sunset = FALSE)
dim(pt1_baseline_hrtmp_wide)
head(pt1_baseline_hrtmp_wide)
```

Convert wide to long:

```{r pt1_baseline_hrtmp_long, cache = TRUE}
pt1_baseline_hrtmp_long <- pt1_baseline_hrtmp_wide %>%
  pivot_longer(cols = starts_with("Hour_"),
               names_to = "Hour",
               names_prefix = "Hour_",
               names_transform = list(Hour = as.integer),
               values_to = "temp_c") %>%
  mutate(date_hour = ISOdatetime(Year, Month, Day, Hour, 0, 0, tz = "America/Los_Angeles")) %>%
  arrange(date_hour)

dim(pt1_baseline_hrtmp_long)
head(pt1_baseline_hrtmp_long)
```

Compute baseline chill portions:

```{r pt1_baseline_hrtmpcp, cache = TRUE}
pt1_baseline_hrtmpcp <- pt1_baseline_hrtmp_long %>% 
  mutate(temp_c = as.numeric(temp_c)) %>% 
  group_by(gs, gcm, scenario) %>% 
  mutate(accum_chill_prtn = Dynamic_Model(temp_c))

head(pt1_baseline_hrtmpcp)

```

Compute the max-chill reached each growing season / GCM during the baseline period. Next we compute the 'safe chill' (10% quantile):

```{r pt1_baseline_maxchill, cache = TRUE}
pt1_baseline_maxchill <- pt1_baseline_hrtmpcp %>% 
  group_by(scenario, gs, gcm) %>% 
  summarise(max_chill = max(accum_chill_prtn), .groups = 'drop') 

dim(pt1_baseline_maxchill)
pt1_baseline_maxchill %>% head()

safe_chill_baseline_tbl <- pt1_baseline_maxchill %>% 
  group_by(scenario) %>% 
  summarize(safe_chill = quantile(max_chill, probs = 0.1), .groups = "drop") 

safe_chill_baseline_tbl
```

Plot distribution of the baseline maximum chill portion:

```{r chill_baseline_histo, cache = TRUE}
## Plot the histogram and overlay the safe chill value
baseline_histo <- ggplot(pt1_baseline_maxchill, aes(x=max_chill)) + 
  geom_histogram() +
  geom_vline(data = safe_chill_baseline_tbl, color = "red", aes(xintercept = safe_chill), size = 1) +
  facet_wrap("scenario", nrow = 1) +
  labs(title = "Maximum Projected Chill Portion", 
     subtitle = "1995 - 2005",
     caption = paste0(
       str_wrap(paste0("GCMs: ", paste(unique(pt1_baseline_maxchill$gcm), collapse = ", "), width = 105)),
       "\n",
       str_wrap("The red line is the 10% quantile and represents the safe winter chill (the mimimum chill portion one may expect 90% of the time assumming all projections are equally likely).", width = 105)),
     x = "end-of-season chill portion", y = "count") +
  theme(plot.caption = element_text(hjust = 0),
        plot.margin = margin(20, 20, 20, 20),
        plot.background = element_rect(colour = "gray50", size = 3))

baseline_histo
```

How Many Growing Seasons and Simulated (Backcast) Climate Conditions During the Baseline Seasons Reached Our Minimum Chill Requirement?

```{r baseline_prct_min_chill_reached, cache=TRUE, message=FALSE}
chill_min <- 54

pt1_baseline_prct_reached_min <- pt1_baseline_hrtmpcp %>% 
  group_by(gs, gcm, scenario) %>% 
  summarise(reached_thresh = max(accum_chill_prtn) >= chill_min) %>% 
  ungroup() %>% 
  group_by(scenario) %>% 
  summarise(percent_reached_thresh = sum(reached_thresh) / n()) %>% 
  mutate(time_period = "1995-2005", percent_reached_thresh = scales::percent(percent_reached_thresh)) %>% 
  relocate(time_period, scenario, percent_reached_thresh)

pt1_baseline_prct_reached_min

```

# Conclusions

In this analysis we examined projected and historic accumulated chill portions for this point:

```{r pt1_10yrs_cap_plot, cache = TRUE}
plot(pt1_10yrs_cap)
```

We looked at the projected temperatures at this location according to 10 GCMs in the recent historic time period (1995-2005), as well as projected for 2080-2090 under two carbon emissions scenarios (RCP4.5 & 8.5). Our analysis shows that the percent of seasons & models where the maximum chill portion by the end of winter is greater than or equal to the amount of chill needed to grow Kerman variety of Pistachios (`r chill_min`) was:

```{r percnt_reached_all_tbl, cache = TRUE}
pt1_baseline_prct_reached_min %>% 
  bind_rows(pt1_10yrs_prct_reached_min)
```

We also computed the 'safe-chill' portion for each of the time periods. This is the amount of chill one can expect to reach 90% of the time:

```{r histos_repeated, cache = TRUE}
safe_chill_baseline_tbl %>% bind_rows(safe_chill_tbl)

baseline_histo
projected_histo
```

## Interpretation

This example was primarily an exercise on how to write code, and should be taken with a grain of salt. However the results we came up with suggest that this location may continue to be viable for the Kerman variety of Pistachios in 2080-2090 under emission scenario 4.5, but may not produce enough winter chill portions consistently enough to be economically viable under RCP 8.5. 

This analysis was based on the assumption that all 10 GCMs recommended for California under the 4th Climate Change Assessment are equally likely. We also interpolated hourly temperatures based on a generic statistical model that was not specific to this location. We could improve this analysis by:

1. Selecting GCMs that are best suited for California's San Joaquin Valley  
1. Reviewing the latest literature to assess whether emission scenario 4.5 or 8.5 seems more likely
1. Waiting for CMIP6 projected climate data which will include sub-daily temperature values (e.g., every 2 hours)

# Questions and Next Steps

Is this a good way to combine / average results from multiple models, multiple years?

Should we use actual or modelled data for the historical baseline?

Would it be useful to also compute **chill hours**? Other agroclimatic metrics?

Do you always start computing chill portions on Sept 1? Should this be a variable?

Add caveats about not looking at a single model, single-year.

Turn this into a Shiny app with selectable start date and required number of chill portions. 


