Spatial Data Analysis with R
Society for Conservation GIS, July 2021

Automation with Branches and Loops

The Keys to Automation: Branches and Loops

Branching with if and else Statements

if and else statements are control structures that determine which block of code runs, based on whether a condition is TRUE or FALSE.

Syntax:

if (condition) {statement} else {other statement}

## Example if statement
x <- 7
if (x > 7) {
  print(x)
} else {
  print("NOT BIG ENOUGH!!")
}
## [1] "NOT BIG ENOUGH!!"

Composite tests with ‘and’ and ‘or’

You can test for multiple conditions using the && (and) and || (or) operators.

x <- 4
y <- 8

(x < 5) && (y < 5)
## [1] FALSE
(x < 5) && (y < 10)
## [1] TRUE
(x < 5) || (y < 10)
## [1] TRUE
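
Note that && and || are meant for single TRUE/FALSE values, which is what if () expects. To test every element of a vector, the element-wise operators & and | can be used instead, optionally collapsed with any() or all(). A minimal sketch, using made-up vectors x_vec and y_vec that are not part of the workshop data:

```r
## Hypothetical example vectors (not from the workshop data)
x_vec <- c(2, 4, 6)
y_vec <- c(3, 8, 1)

## Element-wise 'and' / 'or' return one result per element
both_small   <- (x_vec < 5) & (y_vec < 5)
either_small <- (x_vec < 5) | (y_vec < 5)
print(both_small)
print(either_small)

## Collapse to a single TRUE/FALSE before using the result in if ()
if (any(both_small)) {
  print("At least one pair has both values below 5")
}
```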

if () {…} else if () {…} else {…}

You can string together multiple ‘else if’ checks.

## Pick a random whole number between 1 and 20
x <- sample(1:20, size = 1)
x

## See what it's divisible by (only the first matching branch runs)
if (x %% 2 == 0) {
  print("You picked an even number")
} else if (x %% 3 == 0) {
  print("You picked a multiple of 3")
} else if (x %% 5 == 0) {
  print("You picked a multiple of 5")
} else {
  print("You picked a number that is not divisible by 2, 3, or 5")
}
## [1] 10
## [1] "You picked an even number"

The ifelse() function

A variation of the if-else statement is the ifelse() function. This function is handy for recoding data.

Syntax:

ifelse(condition, value-if-true, value-if-false)

animal <- c("cat", "dog", "cat")   ## hypothetical sample data
is_cat <- ifelse(animal == "cat", 1, 0)
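
As a rough illustration of recoding, the sketch below applies ifelse() to columns of a small data frame. The pets data frame and its columns are invented for this example; the nested call shows one possible way to recode into more than two classes.

```r
## Hypothetical data frame, made up for illustration
pets <- data.frame(animal = c("cat", "dog", "cat", "bird"),
                   weight_kg = c(4.2, 18.5, 3.9, 0.1))

## ifelse() is vectorized: it tests every row at once
pets$is_cat <- ifelse(pets$animal == "cat", 1, 0)

## Calls can be nested to recode into more than two classes
pets$size <- ifelse(pets$weight_kg > 10, "large",
                    ifelse(pets$weight_kg > 1, "medium", "small"))
print(pets)
```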

Code Loops

For loops

Generic syntax of a for loop:

for (xx in yy) {

  ## 
  ## lines of code go here
  ##
 
}

Typically a loop variable such as i takes each value of a vector in turn:

## Simple 'for' loop
for (i in 1:5){
  print(i ^ 2)
}
## [1] 1
## [1] 4
## [1] 9
## [1] 16
## [1] 25

Often the loop variable is used as an index:

## Using a loop variable as an index to grab elements
for (i in 1:5) {
  print(state.name[i])
}
## [1] "Alabama"
## [1] "Alaska"
## [1] "Arizona"
## [1] "Arkansas"
## [1] "California"

Loops can be combined with if-else statements:

## Loop with an if-else statement in the middle
x <- 1:10

for (i in 1:10) {
  if (x[i] >= 7) {
    print(x[i])
  } else {
    print("NOT BIG ENOUGH!!")
  }
}
## [1] "NOT BIG ENOUGH!!"
## [1] "NOT BIG ENOUGH!!"
## [1] "NOT BIG ENOUGH!!"
## [1] "NOT BIG ENOUGH!!"
## [1] "NOT BIG ENOUGH!!"
## [1] "NOT BIG ENOUGH!!"
## [1] 7
## [1] 8
## [1] 9
## [1] 10

While Loops

The generic form of a while loop is:

while (some_condition) {

  ## 
  ## lines of code go here
  ##
  
}

It’s really important that some_condition eventually becomes FALSE; otherwise the loop will run forever!

## Define i
i <- 0

## Go through the loop as long as i is < 0.95
while (i < 0.95) {
  ## Generate i again as a random number between 0..1
  i <- runif(1)
  print(round(i,2))
}
## [1] 0.6
## [1] 0.87
## [1] 0.76
## [1] 0.64
## [1] 0.76
## [1] 0.51
## [1] 0.21
## [1] 0.63
## [1] 0.69
## [1] 0.89
## [1] 0.08
## [1] 0.08
## [1] 0.48
## [1] 0.74
## [1] 0.63
## [1] 0.31
## [1] 0.58
## [1] 0.07
## [1] 0.1
## [1] 0.64
## [1] 0.19
## [1] 0.46
## [1] 0.93
## [1] 0.19
## [1] 0.53
## [1] 0.8
## [1] 0.36
## [1] 0.63
## [1] 0.63
## [1] 0.48
## [1] 0.37
## [1] 0.08
## [1] 0.25
## [1] 0.08
## [1] 0.72
## [1] 0.58
## [1] 0.26
## [1] 0.52
## [1] 0.77
## [1] 0.38
## [1] 0.03
## [1] 0.4
## [1] 0.92
## [1] 0.89
## [1] 0.86
## [1] 0.78
## [1] 0.37
## [1] 0.33
## [1] 0.74
## [1] 0.4
## [1] 0.48
## [1] 0.91
## [1] 0.34
## [1] 0.07
## [1] 0.61
## [1] 0.82
## [1] 0.17
## [1] 0.38
## [1] 0.34
## [1] 0.04
## [1] 0.99
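
One possible safeguard against a runaway loop is to add a counter to the condition, so the loop stops after a fixed number of passes even if the main test never fails. In this sketch, max_tries and n_tries are names chosen for illustration, and the cap of 100 is arbitrary:

```r
## Cap on the number of passes (an arbitrary value chosen for this sketch)
max_tries <- 100
n_tries <- 0
i <- 0

## Keep going only while i is below 0.95 AND the cap has not been reached
while (i < 0.95 && n_tries < max_tries) {
  i <- runif(1)
  n_tries <- n_tries + 1
}

cat("Stopped after", n_tries, "tries; last value was", round(i, 2), "\n")
```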

break

Another way to escape out of a loop is the break statement:

while (TRUE) {
  if (something_important_happens) {
    break
  }
}
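
break works inside for loops as well. The runnable sketch below stops at the first missing value in a vector; the readings vector is made up for illustration.

```r
## Hypothetical vector of readings, made up for illustration
readings <- c(12, 15, 11, NA, 14, 13)
first_missing <- NA

for (i in seq_along(readings)) {
  if (is.na(readings[i])) {
    ## Record where the problem is, then leave the loop immediately
    first_missing <- i
    break
  }
}

print(first_missing)
## [1] 4
```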

Looping Through Data Frames

One of the most common uses of a loop is to loop through a data frame.

## Loop thru a data frame
for (i in 1:nrow(Orange)) {
  circum_by_age <- Orange[i, "circumference"]  /  Orange[i, "age"] 
  if (circum_by_age > 0.2) {
    cat("High ratio found in row ", i, "\n")
  }
}
## High ratio found in row  1 
## High ratio found in row  8 
## High ratio found in row  15 
## High ratio found in row  22 
## High ratio found in row  29

Exercise: Loop through the data frame mtcars and print the value of the wt column for each row.

Solution:

for (i in 1:nrow(mtcars)) {
  print(mtcars[i, "wt"])
}
## [1] 2.62
## [1] 2.875
## [1] 2.32
## [1] 3.215
## [1] 3.44
## [1] 3.46
## [1] 3.57
## [1] 3.19
## [1] 3.15
## [1] 3.44
## [1] 3.44
## [1] 4.07
## [1] 3.73
## [1] 3.78
## [1] 5.25
## [1] 5.424
## [1] 5.345
## [1] 2.2
## [1] 1.615
## [1] 1.835
## [1] 2.465
## [1] 3.52
## [1] 3.435
## [1] 3.84
## [1] 3.845
## [1] 1.935
## [1] 2.14
## [1] 1.513
## [1] 3.17
## [1] 2.77
## [1] 3.57
## [1] 2.78

Recording the Results of Each Iteration

Loops are commonly used to apply an algorithm to each element or row.

You may want to store or save the results of each pass. A common technique is to create a variable to save the results before you start the loop, and then update or append it within the loop.

## Set up a variable to store the cumulative number of assaults
num_assaults <- 0

## Loop thru the USArrests data frame and increment the num_assaults variable
for (i in 1:nrow(USArrests)) {
  num_assaults <- num_assaults + USArrests[i, "Assault"]
}

## Print the final result
print(num_assaults)
## [1] 8538
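
When you want to keep one result per row rather than a running total, a common variation is to create a vector of the right length first and fill in one slot per pass. The sketch below is one way to do that; the murder_per_assault name and the ratio itself are chosen only for illustration.

```r
## Pre-allocate a numeric vector with one slot per row
murder_per_assault <- numeric(nrow(USArrests))

## seq_len() is a safe stand-in for 1:nrow(); it yields nothing for a zero-row table
for (i in seq_len(nrow(USArrests))) {
  murder_per_assault[i] <- USArrests[i, "Murder"] / USArrests[i, "Assault"]
}

## Label the results with the state names and peek at the first few
names(murder_per_assault) <- rownames(USArrests)
print(head(round(murder_per_assault, 3)))
```

Filling a pre-allocated vector avoids growing the object on every pass, which can matter for long loops.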

Looping through Files

Create a character vector of file names:

## The dot is escaped so it matches a literal period
my_shapefiles <- list.files(path = "./data", pattern = "\\.shp$")

for (fn in my_shapefiles) {
  print(fn)
  
  ## Or do something with the file, such as:
  ## x <- st_read(file.path("./data", fn)) 
  ## st_write(x %>% st_transform(4326), file.path("./unprojected/", fn))
  
}
## [1] "sf_schools.shp"
## [1] "veg37.shp"
## [1] "yose_boundary.shp"
## [1] "yose_poi.shp"

The pattern argument in list.files() takes a regular expression.
To match files by the end of their name (i.e., extension), add the $ character at the end (e.g., “\\.shp$”; the backslashes make the dot match a literal period rather than any character).
To match files by the beginning of their name, add the ^ character at the beginning (e.g., “^sf”).

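list.files() has several optional arguments you can use to control which files are returned:

  • recursive: also search subdirectories
  • include.dirs: include subdirectory names in recursive listings
  • ignore.case: match the pattern case-insensitively
  • full.names: return the directory path along with each file name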
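
To see how some of these arguments change the result, the sketch below builds a small throwaway folder in the session's temporary directory and lists it three ways. All folder and file names are made up for illustration, and the files are empty placeholders rather than real shapefiles.

```r
## Build a small throwaway folder so the example is self-contained
## (all file and folder names here are made up)
demo_dir <- file.path(tempdir(), "listfiles_demo")
dir.create(file.path(demo_dir, "archive"), recursive = TRUE, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, c("parks.shp", "parks.dbf", "ROADS.SHP"))))
invisible(file.create(file.path(demo_dir, "archive", "old_parks.shp")))

## Default: file names only, top level only, case-sensitive
shp_default <- list.files(demo_dir, pattern = "\\.shp$")

## ignore.case = TRUE also matches the upper-case extension
shp_any_case <- list.files(demo_dir, pattern = "\\.shp$", ignore.case = TRUE)

## recursive = TRUE descends into subfolders;
## full.names = TRUE returns paths that can be passed directly to a reader function
shp_all <- list.files(demo_dir, pattern = "\\.shp$", ignore.case = TRUE,
                      recursive = TRUE, full.names = TRUE)

print(shp_default)
print(shp_any_case)
print(basename(shp_all))
```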

Another approach is to create a text file with the file names, and loop through those (see next).

Looping through the Lines of a Text File

Another common automation technique is to loop through the lines of a text file, which may contain file names, sets of parameters, etc.

## Define a filename to a text file
txt_fn <- "./data/proverbs.txt"

## Open a 'read-only' connection to the text file
con <- file(txt_fn, "r")

## Loop through the lines of the file 
while (TRUE) {
  ## Read the next line with readLines (returns a character vector of length 1, or length 0 at the end of the file)
  my_line <- readLines(con, n = 1)
  
  ## Check to see if we're at the end
  if (length(my_line) == 0) {
    ## Gotten to the end. Time to exit the loop.
    break
  }
  
  ## Do something
  print(my_line)
}

## Close the connection to the text file
close(con)
## [1] "If a donkey kicks you and you kick back, you are both donkeys. (Gambia)"
## [1] "An adult squatting sees farther than a child on top of tree. (Gambia)"
## [1] "A fly that has no one to advice it, follows the corpse into the grave. (Gambia)"
## [1] "Giant silk cotton trees grow out of very tiny seeds. (Gambia)"
## [1] "However black a cow is, the milk is always white. (Gambia)"
## [1] "The disobedient fowl obeys in a pot of soup (Benin - Nigeria)."
## [1] "When two elephants fight it is the grass that suffers (Uganda)."
## [1] "The frog does not jump in the daytime without reason (Nigeria)."
## [1] "One goat cannot carry another goat's tail (Nigeria)."
## [1] "The family is like the forest, if you are outside it is dense, if you are inside you see that each tree has its own position (Akan)."
## [1] "It is the woman whose child has been eaten by a witch who best knows the evils of witchcraft (Nigeria)."
## [1] "The hunter does not rub himself in oil and lie by the fire to sleep (Nigeria)."
## [1] "The hunter in pursuit of an elephant does not stop to throw stones at birds (Uganda)."
## [1] "If all seeds that fall were to grow, then no one could follow the path under the trees (Akan)."
## [1] "Even the mightest eagle comes down to the tree tops to rest (Uganda)."
## [1] "A tiger does not have to proclaim its tigri-tude (Wole Soyinka - Nigeria)."
## [1] "Before you ask a man for clothes, look at the clothes that he is wearing (Yoruba, Nigeria)."
## [1] "Until lions have their own historians, tales of the hunt shall always glorify the hunter (Igbo, Nigeria)."
## [1] "Although the snake does not fly it has caught the bird whose home is in the sky (Akan)"
## [1] "One should never rub bottoms with a porcupine (Akan)."
## [1] "Fowls will not spare a cockroach that falls in their mist (Akan)."
## [1] "You do not need a big stick to break a cock's head (Akan)."
## [1] "Marriage is like a groundnut, you have to crack them to see what is inside (Akan)."
## [1] "The rain wets the leopard's spots but does not wash them off (Akan)."
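
For files small enough to fit in memory, a simpler pattern may be to read all the lines at once (readLines() does this when n is omitted) and loop over the resulting character vector. The sketch below writes its own throwaway file so it can run anywhere; the file contents are made up for illustration.

```r
## Write a small throwaway text file so the example is self-contained
## (the contents are made up)
tmp_fn <- tempfile(fileext = ".txt")
writeLines(c("sf_schools.shp", "veg37.shp", "yose_poi.shp"), tmp_fn)

## Read every line in one call; the result is a character vector
all_lines <- readLines(tmp_fn)

## Loop over the lines like any other vector
for (one_line in all_lines) {
  cat("Processing", one_line, "\n")
}
```

The connection-based loop remains useful when a file is too large to read in one go.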

CSV files are basically text files that have some structure (e.g., columns, column labels, text delimiters). They’re generally easier to work with than unstructured text files, and easy to create. When possible, use a CSV file over an unstructured text file.
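
A sketch of what this might look like: each row of a CSV holds one set of parameters, and a loop works through the rows. The column names (layer, buffer_m, epsg) and their values are hypothetical, and the loop only prints what it would do.

```r
## Write a small throwaway CSV so the example is self-contained
## (column names and values are made up)
csv_fn <- tempfile(fileext = ".csv")
writeLines(c("layer,buffer_m,epsg",
             "schools,500,4326",
             "trails,100,26911"), csv_fn)

## read.csv() returns a data frame with one row per parameter set
params <- read.csv(csv_fn, stringsAsFactors = FALSE)

## Loop through the rows and pull out named columns
for (i in seq_len(nrow(params))) {
  cat("Buffer", params[i, "layer"], "by", params[i, "buffer_m"],
      "m, then project to EPSG", params[i, "epsg"], "\n")
}
```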

(Better) Alternatives to Loops

Many R functions are vectorized, so often you don’t actually need a code loop.

- see also Loops Tutorial from DataCamp

Example: add a random offset to each element of a sequence

## Addition is vectorized, so we don't need a loop
data.frame(num=1:5, num_plus_offset=1:5 + rnorm(5, mean=0, sd=0.5))
##   num num_plus_offset
## 1   1       0.7445457
## 2   2       2.5150130
## 3   3       3.0001166
## 4   4       3.9911882
## 5   5       5.2077830
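
The same idea applies to the data frame loops shown earlier. As a sketch, the USArrests total and the Orange row search can each be written as a single vectorized expression (total_assaults and high_rows are names chosen for this example):

```r
## Total assaults: sum() replaces the accumulator loop
total_assaults <- sum(USArrests$Assault)
print(total_assaults)
## [1] 8538

## Rows of Orange with a high circumference-to-age ratio:
## the division and comparison operate on whole columns at once
high_rows <- which(Orange$circumference / Orange$age > 0.2)
print(high_rows)
## [1]  1  8 15 22 29
```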

In addition to for and while loops, R has functions like sapply() and lapply() that will apply a function to each element of a vector/list.

sapply() and lapply() are usually more compact than traditional loops and are sometimes faster, although truly vectorized functions are typically the fastest option.
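
A minimal sketch of both functions, reusing the squares and state.name examples from the for loop section (squares and name_lengths are names chosen for this example):

```r
## sapply() returns a simplified result (here, a numeric vector)
squares <- sapply(1:5, function(i) i^2)
print(squares)
## [1]  1  4  9 16 25

## lapply() always returns a list, one element per input
name_lengths <- lapply(state.name[1:3], nchar)
str(name_lengths)
```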



