Working with Cal-Adapt Climate Data in R:
Large Queries

Querying Large Data

Imagine you want to extract the climate data for 36,000 vernal pool locations.

Several issues arise when querying a large number (1000s) of locations: many separate API calls are needed, the query can take a long time to run, and an interrupted connection can mean starting over.

General Strategies

1) Aggregate point features by LOCA grid cells

  • The same API call can be used for all points in the same LOCA grid cell (see the sketch after this list)

2) Download rasters

  • ca_getrst_stars()
  • Although the download takes longer, data extraction and geoprocessing may be faster locally (see the sketch after this list)

3) Save values in a local SQLite database

  • ca_getvals_db()
  • Values are saved as they are received
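
A minimal sketch of strategy 1, assuming a hypothetical sf point layer named vernal_pools_sf. It uses ca_locagrid_geom() from caladaptr to fetch the LOCA grid as polygons; the grid cell id column name ("id") is an assumption to verify against your version of the package.

library(caladaptr)
library(sf)
library(dplyr)

## Fetch the LOCA grid cells as an sf polygon layer
locagrid_sf <- ca_locagrid_geom()

## Tag each point with the id of the LOCA grid cell it falls in
## (vernal_pools_sf is a hypothetical point layer; reproject to match)
pools_cells <- vernal_pools_sf %>% 
  st_transform(st_crs(locagrid_sf)) %>% 
  st_join(locagrid_sf["id"])

## Keep one representative point per occupied cell; query only these,
## then join the climate values back to all points by cell id
query_pts <- pools_cells %>% distinct(id, .keep_all = TRUE)

Strategy 2 becomes a one-liner once you have an API request object; this sketch assumes a request named my_api_req and saves the downloaded TIFs to the working directory:

tif_files <- my_api_req %>% ca_getrst_stars(out_dir = ".")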


Saving Values to a Local Database

Use ca_getvals_db() Instead of ca_getvals_tbl()

Sample usage:

my_vals <- my_api_req %>% 
  ca_getvals_db(db_fn = "my_data.sqlite",   ## SQLite file to create or append to
                db_tbl = "daily_vals",      ## table to store the values in
                new_recs_only = TRUE)       ## skip records already downloaded

new_recs_only = TRUE → if the connection is interrupted, rerunning the command picks up where it left off, fetching only records not already in the database

ca_getvals_db() returns a ‘remote tibble’ linked to a local database

Work with ‘remote tibbles’ using many of the same techniques as regular tibbles (with a few exceptions)
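
For example, the summary below is computed inside the database, and only the result is pulled into memory by collect(). This is a sketch; the column names (gcm, scenario, val) are assumptions that depend on your API request.

library(dplyr)

avg_by_gcm <- my_vals %>% 
  group_by(gcm, scenario) %>% 
  summarize(avg_val = mean(val, na.rm = TRUE)) %>% 
  collect()     ## bring the (small) summary back as a regular tibble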

ca_db_info() and ca_db_indices() help you view and manage database files
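
For example, to print a summary of the database behind my_vals (a sketch; depending on your version of caladaptr, ca_db_info() may also accept the SQLite file name):

ca_db_info(my_vals)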

See the Large Queries Vignette for details


Notebook 3: Large Queries

In Notebook 3 you will:

  • query using an sf polygon object
  • download climate values into a SQLite database
  • summarize the values in a remote tibble with dplyr statements

Notebook 3. Large Queries and Rasters (solutions available)
