Spatial Data Analysis with R
Society for Conservation GIS, July 2021

Working with Files and Folders

Working Directory

When you import or export objects to disk using a script, you don’t have the luxury of an ‘Open…’ or ‘Save As…’ dialog. You have to specify in code where the file is coming from (or going to).

Relative file paths are interpreted relative to the working directory. You can view the working directory with getwd():

## View the working directory
getwd()

‘directory’ = ‘folder’

To see a list of files in the current working directory, use list.files(x), where x is a directory.

## List the files in the current working directory
## Note '.' is shorthand for the current working directory
list.files(".")

list.files() also lists directories

Optional arguments let you specify a pattern, recurse subdirectories, etc.
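As a rough sketch of those optional arguments, the example below builds a small throwaway folder and lists it with pattern, recursive, and full.names. The folder and file names are made up for illustration.

```r
## Build a throwaway folder tree inside the session's temporary directory
## (all folder and file names here are hypothetical)
demo_dir <- tempfile("listfiles_demo_")
dir.create(file.path(demo_dir, "sub"), recursive = TRUE)
file.create(file.path(demo_dir, c("a.csv", "b.txt")))
file.create(file.path(demo_dir, "sub", "c.csv"))

## Top level only: two files plus the 'sub' directory
top_level <- list.files(demo_dir)

## Only names ending in .csv (pattern is a regular expression)
csv_top <- list.files(demo_dir, pattern = "\\.csv$")

## Recurse into subdirectories and return full paths
csv_all <- list.files(demo_dir, pattern = "\\.csv$",
                      recursive = TRUE, full.names = TRUE)
csv_all
```

full.names = TRUE is handy when you plan to pass the results straight to an import function.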


To list files in a subdirectory of the current directory:

list.files("./mysubfolder")

Here ‘.’ tells R to start at the working directory, and then go into ‘mysubfolder’.

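Write paths in R with forward slashes (/). A single back slash (\) is an escape character inside an R string, so a Windows-style path must use either forward slashes or doubled back slashes (\\). Sorry Windows users!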

One way to change them in your code:

  • select the path in your code
  • ctrl-f (Find)
  • find \ replace /
  • ‘Replace all’
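If the path is already stored in a variable, the same swap can be done in code. This is a minimal sketch; the path is a made-up example, and each back slash has to be doubled inside the R string.

```r
## A Windows-style path; each back slash is doubled inside an R string
## (the path itself is hypothetical)
win_path <- "C:\\Users\\Andy\\Documents\\data.csv"

## Replace every back slash with a forward slash
fixed_path <- gsub("\\", "/", win_path, fixed = TRUE)
fixed_path
## [1] "C:/Users/Andy/Documents/data.csv"
```

Here fixed = TRUE tells gsub() to treat the pattern as literal text rather than a regular expression.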


List all the Shapefiles in the data directory.

## List the Shapefiles in the 'data' directory
## Note 'pattern' is a regular expression, not a wildcard
list.files("./data", pattern = "\\.shp$")
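The pattern argument of list.files() is a regular expression, not a wildcard, so it helps to anchor the extension to the end of the name. The sketch below tests a few patterns against made-up file names with grepl() instead of reading a real folder. utils::glob2rx() converts a wildcard into an equivalent regular expression.

```r
## Hypothetical file names, as list.files() might return them
files <- c("parks.shp", "parks.shp.xml", "parks.shx", "roads.SHP")

## Names ending in '.shp' (case sensitive)
ends_shp <- grepl("\\.shp$", files)

## The same test, ignoring case
ends_shp_any_case <- grepl("\\.shp$", files, ignore.case = TRUE)

## Let glob2rx() build the regular expression from a wildcard
from_glob <- grepl(utils::glob2rx("*.shp"), files)

files[ends_shp]
files[ends_shp_any_case]
```

list.files() also accepts ignore.case = TRUE, which is useful when extensions are not consistently lower case.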

Absolute File Paths

An absolute file path starts at the root directory, written '/'. On Windows it may also begin with a volume (drive letter).

Examples:

## Define a file name that begins with the root directory (on the current drive / volume)
x <- "/temp/test.csv"

## See if it exists
file.exists(x)
## [1] FALSE

## Another one, with a drive letter
x <- "E:/downloads/my_notes.txt"
file.exists(x)
## [1] FALSE

To confirm that you typed a file name correctly, use file.exists().

File names on Windows are not case sensitive, but for compatibility across platforms make it a practice to match the case.
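file.exists() accepts several paths at once, and dir.exists() checks specifically for directories. This is a small sketch, with a deliberately missing file name as the assumption.

```r
## One path that exists (the session's temp directory) and one that
## should not (a hypothetical file name)
paths <- c(tempdir(), file.path(tempdir(), "no_such_file_12345.csv"))

## One TRUE / FALSE per path
exists_flags <- file.exists(paths)
is_dir_flags <- dir.exists(paths)

## Report a missing file with a clear message before trying to import it
csv_fn <- paths[2]
if (!file.exists(csv_fn)) {
  message("File not found: ", csv_fn)
}
```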

Importing Files

Example: Importing a CSV

You can import a CSV file using read.csv(x), where x is the path to a file.

## First, save the path and file name to a variable
csv_fn <- "./data/sf_libraries.csv"

## See if it exists
file.exists(csv_fn)

## Import
my_data <- read.csv(csv_fn)

What type of object (class) does read.csv() return?

Answer

class(my_data)
## [1] "data.frame"
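If you don't have sf_libraries.csv handy, the same steps can be tried on a throwaway file. The sketch below writes a tiny data frame with invented contents to a temporary CSV, then imports it again.

```r
## A tiny data frame (contents invented for illustration)
libraries <- data.frame(name = c("Main", "Mission"),
                        branch_id = c(1L, 2L))

## Write it to a temporary CSV file
csv_fn <- tempfile(fileext = ".csv")
write.csv(libraries, csv_fn, row.names = FALSE)

## Check that the file exists, then import it
if (file.exists(csv_fn)) {
  my_data <- read.csv(csv_fn)
}

## Inspect the result
class(my_data)
str(my_data)
```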

Changing the Working Directory

You can change the working directory with setwd().

setwd("/projects/soc101/sf_census/")

You can also set the working directory from the RStudio Session menu.

If you create an RStudio Project, the working directory defaults to the Project folder.
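setwd() invisibly returns the previous working directory, which makes it easy to switch back. The sketch below uses the session's temporary directory as a stand-in for a project folder.

```r
## Switch to another folder, remembering where we were
## (tempdir() stands in for a real project folder)
old_wd <- setwd(tempdir())
getwd()

## ... work with relative paths here ...

## Go back to the original working directory
setwd(old_wd)
getwd()
```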

‘Home’ Directory

On Windows, the ‘Home’ directory is usually your Documents folder; on macOS and Linux it is your user folder. Example:

C:/Users/Andy/Documents

You can tell R to start a path in the Home directory with the ~ character:

## List files in your 'Home' directory
list.files("~")
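To see which folder ~ actually stands for on your machine, you can expand it with path.expand(). The file name below is hypothetical.

```r
## Expand '~' to the full path of the Home directory
home_dir <- path.expand("~")
home_dir

## Expand a path that starts in the Home directory
## (my_notes.txt is a made-up file name; it does not need to exist)
notes_fn <- file.path("~", "my_notes.txt")
notes_full <- path.expand(notes_fn)
notes_full
```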

Convenience Functions

file.choose()

allows you to pick a file using a file selection dialog, then returns the name.

## Select a file
x <- file.choose()

## View x
x


file.path()

lets you concatenate folders and files to construct a syntactically valid name:

## Construct a complete file name from a set of directories and a file name
file.path("~", "BaseLayers", "California", "Parks.shp")
## [1] "~/BaseLayers/California/Parks.shp"
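file.path() is also vectorized, and a few companion functions take a path apart again. The sketch below uses made-up folder and file names.

```r
## One folder, several file names (all names are hypothetical)
shp_paths <- file.path("BaseLayers", "California", c("Parks.shp", "Roads.shp"))
shp_paths
## [1] "BaseLayers/California/Parks.shp" "BaseLayers/California/Roads.shp"

## Split a path back into its pieces
fn  <- basename(shp_paths[1])          # "Parks.shp"
dn  <- dirname(shp_paths[1])           # "BaseLayers/California"
ext <- tools::file_ext(fn)             # "shp"
stem <- tools::file_path_sans_ext(fn)  # "Parks"
```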


file_path_as_absolute()

(from the tools package) shows you the absolute path for any existing file or folder:

## View the expanded path of the 'data' folder
tools::file_path_as_absolute("./data")
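tools::file_path_as_absolute() stops with an error if the path does not exist. This sketch shows both cases, using a throwaway folder (hypothetical name) inside the temporary directory and tryCatch() to trap the error.

```r
## Work inside the temporary directory so the relative paths are predictable
old_wd <- setwd(tempdir())

## Create a folder (hypothetical name) and get its absolute path
dir.create("data_demo", showWarnings = FALSE)
abs_path <- tools::file_path_as_absolute("./data_demo")
abs_path

## A folder that does not exist triggers an error; return NA instead
missing_path <- tryCatch(
  tools::file_path_as_absolute("./no_such_folder_12345"),
  error = function(e) NA_character_
)

## Restore the original working directory
setwd(old_wd)
```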

Saving R Objects to Disk

You can save individual objects (variables) to disk with save().

## Create 1000 random values
rnd_vals <- rnorm(1000) * 20

## Save to the Home directory
save(rnd_vals, file="~/my_random_numbers.RData")

.RData and .Rda are common extensions for R data files, but you can name your file anything you want.

To save multiple objects to the same file, pass them all to save(), or give their names as a character vector with the list argument:

save(list = c("x", "y"), file = "~/xy.RData")

To save your entire workspace (all variables in memory), use save.image(). RStudio also has a ‘Save Workspace’ button on the ‘Environment’ window, and will probably ask if you want to save your workspace when you quit the program.


Load Saved Objects Back into R

save() stores objects in a compressed binary format that R understands, keeping their names, classes, and attributes. This makes them easy to bring back into R using the sister function load().

## Delete rnd_vals from memory
rm(rnd_vals)
ls()

## Load the saved file
load("~/my_random_numbers.RData")

## View objects in memory. (rnd_vals should be back!)
ls()
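load() also returns (invisibly) the names of the objects it restored, and it can load into a separate environment so nothing in your workspace gets overwritten. This is a minimal sketch using a temporary file and made-up objects.

```r
## Two small objects (invented for illustration)
x <- 1:5
y <- letters[1:3]

## Save both to a temporary .RData file, then remove them from memory
rdata_fn <- tempfile(fileext = ".RData")
save(x, y, file = rdata_fn)
rm(x, y)

## load() restores the objects and returns their names
loaded_names <- load(rdata_fn)
loaded_names

## Load into a separate environment instead of the workspace
e <- new.env()
load(rdata_fn, envir = e)
ls(e)
```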

RStudio Files Pane

In RStudio, use the Files tab to browse your files, open R files, and set the working directory.

You can open many files by clicking on them in the Files pane:

*.R, *.txt, *.css - clicking will open for editing
*.RData, *.rda - clicking will load their objects into your workspace

Saving and Sourcing Scripts

You can save and open scripts, R Markdown files, etc. from the RStudio File menu.

If you have a set of commands that you want to rerun repeatedly, you can save them as a *.R file and then run them all at once with the source() function.

Example: to run all the commands in a script named import_clean_data.R:

source("import_clean_data.R")

source() automatically runs all the commands in a script, so only use it with scripts that are ‘ready-to-go’.

To run commands one-by-one, open the script in RStudio and use the ‘run’ button.
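To try source() without an existing script, the sketch below writes a two-line script to a temporary file and runs it. The script contents are invented for illustration.

```r
## Write a tiny script to a temporary .R file
script_fn <- tempfile(fileext = ".R")
writeLines(c("radius <- 2",
             "circle_area <- pi * radius^2"),
           script_fn)

## Run every command in the script
source(script_fn)

## Objects created by the script are now in memory
circle_area
```

Adding echo = TRUE to source() prints each command as it runs, which can help when checking a script.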

R Notebook Exercise

nb_files-folders_scgis21.Rmd
Learn how to work with files and folders

preview notebook | answer key

Summary

Importing and exporting data typically requires passing a file name and/or a directory name.

Relative file names are interpreted in reference to the current Working Directory, which you can refer to with ‘.’.

The ‘Home’ Directory ('~') is a good place to store things you use frequently.

Convenience functions like file.exists() and list.files() can make your life easier.




Next: Importing and Plotting Vector Data