Write values from an API request to a local database
ca_getvals_db(
x,
db_fn,
db_tbl,
omit_col = NULL,
indices = NULL,
new_recs_only = TRUE,
trans_len = 100,
lookup_tbls = TRUE,
lookup_ret_joined = TRUE,
pause_n = 1000,
pause_secs = 60,
write_sidecar = TRUE,
stop_on_err = TRUE,
quiet = FALSE,
debug = FALSE
)
A Cal-Adapt API request
File name of a SQLite database. See Details.
The name of a database table. See Details.
Columns to exclude from the tibble
Name of fields to index. See Details.
Write new records only to the database. See Details.
Number of APIs calls per write transaction. See Details.
Use lookup tables
Return a table with lookup table fields, ignored if lookup_tbls = FALSE. See Details.
Number of API calls after which a built-in pause is triggered. See Details.
Number of seconds to pause. See Details.
Save table metadata in a separate file. See Details.
Stop if the server returns an error
Suppress messages
Print additional output at the console
A remote tibble linked to the SQLite database.
ca_getvals_db
fetches data from the Cal-Adapt API and writes the data to a SQLite database as they're received. This
allows you to fetch relatively large volumes of data in the background, and potentially over multiple sessions as it will
pick up where it left off if interrupted. Saving the values in a database also reduces the risk of exhausting your RAM.
Use ca_getvals_db
to fetch large volumes of data (i.e., hundreds of thousands of values), or whenever you'd like
to keep a local copy of the data. Note however for small amounts of data there is no advantage to putting it in a database
as it will be slighly slower to retrieve and work with.
db_fn
should be a file name with path to a SQLite database. A SQLIte database is a single file typically
with a .db or .sqlite extension. If the database doesn't exist, it will be created. If it already
exists, the new data will be added to it.
db_tbl
should the name of a table within the database where the new data will be saved. The table name should not
contain special characters and spaces are discouraged. If new_recs_only = TRUE
, only new records will be
added to the database.
trans_len
defines the number of API calls per
SQLite transaction (i.e. how many API calls of data to
accumulate before doing a write operation to the database). This can speed things up. Set it to 0 to disable transactions.
If lookup_tbls = TRUE
, the database will create lookup tables for categorical columns such as GCM, scenario,
cvar, period, slug, etc. This can dramatically reduce the size of the SQLite database file and is generally recommended.
id lookup_ret_joined = TRUE
, the tibble returned will have the lookup tables joined (i.e., column names will
be unaltered); if not the returned tibble will have id values for certain values. A small text file is created for each SQLite
database containing the names of the tables and SQL statement to join them (read automatically
by ca_db_read
).
indices
is a vector of column names in db_tbl
that you'd like indexed (ignored if lookup_tbls = FALSE
).
Creating indices can improve the performance of filters and joins when you generate summaries, but at the cost of a larger
database file and slightly slower write operations. Fields you can create indices on include "feat_id"
(the location id value),
"cvar"
, "gcm"
, "scenario"
, "period"
, "slug"
, and "spag"
.
Indices can also be added to a SQLite database after downloading is complete with ca_db_indices. For large queries (e.g. thousands of
API calls), it is recommended to not build indices during the download process, and only add indices for those fields you
plan to filter on or join during your analysis. You can view which indices exist with ca_db_info
.
pause_n
is the number of API calls after which a built-in pause of length pause_secs
is triggered. This is intended
to avoid disruption on the Cal-Adapt server. The maximum value for pause_n
is 2500, and the minimum value
for pause_secs
is 30 seconds.
The returned tibble is linked to the SQLite datbase. For the most part you can use the same dplyr functions to manipulate the results, but to retrieve the actual values you need to use `collect()`. For more info working with a linked database, see https://dbplyr.tidyverse.org/articles/dbplyr.html.