Intro to R Part 1:

Getting Started


Andy Lyons
September 27, 2023

https://ucanr-igis.github.io/IntroR_Oct23/



About Me…

About You

Number of registrants: 39

Familiarity with R

Email domain

Location

Goal

Move in the direction of becoming functional with R!!


Learning Strategy

1) Understand foundational terms and concepts

2) Hands-on practice

3) Discover RStudio’s bells and whistles

4) Learn how to get help


Today’s Focus

Learning Tips

1) Watch now, practice later

2) Review what we cover within 24 hours


Workshop Series

Date                              Session
Sep 27, 2023, 10:00a - 12:00p     Part 1. Getting Started
Oct 4, 2023, 10:00a - 12:00p      Part 2. Packages, Functions, and Importing Data
Oct 11, 2023, 10:00a - 12:00p     Part 3. Data Wrangling
Oct 18, 2023, 10:00a - 12:00p     Part 4. Automation and ggplot


See also Getting Started with R resources.

R and RStudio

Why is R So Popular?

  1. It’s free!

  2. Huge user community (especially academics)

  3. Thousands of add-ons (packages) that extend its capabilities

  4. Particularly strong in plotting and reporting

  5. Once you get over the initial hump, you can work very efficiently

  6. Makes it easy to get your code “out there”

  7. Solid overall programming language



Exercise 1: RStudio Exploration and Basic Commands

Exercise 1 Topics

  1. Using R like a fancy calculator
  2. Order of operations
  3. Comparison operators
  4. Saving the results of expressions to variables
  5. Rules for naming variables
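
The topics above can be sketched in a few lines typed at the console. The variable names (`plot_area`, `num_trees`) and values are made up for illustration and are not taken from the exercise itself.

```r
# R as a calculator
5 + 3 * 2        # multiplication is done before addition
## [1] 11
(5 + 3) * 2      # parentheses change the order of operations
## [1] 16
2^3              # exponent
## [1] 8

# Comparison operators return TRUE or FALSE
7 > 3
## [1] TRUE
7 == 3           # two equals signs test for equality
## [1] FALSE

# Save the result of an expression to a variable (<- and = both work here)
plot_area <- 20 * 50
num_trees = 8
plot_area / num_trees
## [1] 125
```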

RStudio Cloud project for this workshop:

https://posit.cloud/content/6638058

After it opens:


Break!

Exercise Review

Key vocabulary terms are in italic.

Naming Objects

The rules for naming objects are pretty flexible. You can use letters, numbers, periods, and underscores. Other special characters (spaces, hyphens, etc.) are not allowed in ordinary names.

A few rules to take note of:


Naming Styles

There are a handful of popular naming styles. Pick one that you like, and be consistent!

Style                                     Example
alllowercase                              adjustcolor
period.separated                          shoe.size
underscore_separated (aka snake case)     numeric_version
lowerCamelCase                            addTaskCallback
UpperCamelCase                            SignatureMethod
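
One way to check whether a name is allowed is sketched below. `is_valid_name()` is a hypothetical helper written for this example, not a built-in function; it relies on base R's `make.names()`, which converts text into a syntactically valid name. The object names and values are also made up.

```r
# Names are case-sensitive: these are two different objects
shoe_size <- 9
Shoe_Size <- 10
shoe_size == Shoe_Size
## [1] FALSE

# Hypothetical helper: a name that make.names() returns unchanged
# was already a valid name
is_valid_name <- function(x) make.names(x) == x

is_valid_name(c("shoe.size", "numeric_version", "addTaskCallback"))
## [1] TRUE TRUE TRUE

# Starting with a number, or containing a space or hyphen, is not allowed
is_valid_name(c("2nd_plot", "shoe size", "shoe-size"))
## [1] FALSE FALSE FALSE
```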

Data Types

All variables have a class or data type, which you can view using class().

num_plots = 10
class(num_plots)
## [1] "numeric"

Other common data types:
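
A minimal sketch of a few other classes you are likely to meet. This selection is an assumption made for illustration and may not match the list shown in the workshop.

```r
class("Quercus lobata")   # text
## [1] "character"
class(TRUE)               # TRUE / FALSE values
## [1] "logical"
class(10L)                # the L suffix makes a whole number an integer
## [1] "integer"
class(Sys.Date())         # today's date
## [1] "Date"
```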

Vectors

Vectors are R objects that contain multiple values of the same class.

Example:

i = 4:12
i
## [1]  4  5  6  7  8  9 10 11 12


More examples:


Creating Vectors

In general, you need to use a function or operator to create a vector.

Sequence of numbers with the : operator:

1:10
##  [1]  1  2  3  4  5  6  7  8  9 10


Repeat function:

rep("Quercus lobata", 5)
## [1] "Quercus lobata" "Quercus lobata" "Quercus lobata" "Quercus lobata"
## [5] "Quercus lobata"


Combine elements of the same class with c():

yn <- c(TRUE, FALSE, TRUE)
yn
## [1]  TRUE FALSE  TRUE
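
Because every element of a vector must share one class, c() quietly converts mixed inputs to a common class. A short sketch with made-up values:

```r
# Mixing numbers, text, and logicals: everything becomes character
mixed <- c(1, "oak", TRUE)
mixed
## [1] "1"    "oak"  "TRUE"
class(mixed)
## [1] "character"

# Mixing numbers and logicals: TRUE becomes 1 and FALSE becomes 0
c(10, TRUE, FALSE)
## [1] 10  1  0
```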


Some built-in constants are also vectors:

LETTERS
##  [1] "A" "B" "C" "D" "E" "F" "G" "H" "I" "J" "K" "L" "M" "N" "O" "P" "Q" "R" "S"
## [20] "T" "U" "V" "W" "X" "Y" "Z"
state.abb
##  [1] "AL" "AK" "AZ" "AR" "CA" "CO" "CT" "DE" "FL" "GA" "HI" "ID" "IL" "IN" "IA"
## [16] "KS" "KY" "LA" "ME" "MD" "MA" "MI" "MN" "MS" "MO" "MT" "NE" "NV" "NH" "NJ"
## [31] "NM" "NY" "NC" "ND" "OH" "OK" "OR" "PA" "RI" "SC" "SD" "TN" "TX" "UT" "VT"
## [46] "VA" "WA" "WV" "WI" "WY"
month.name
##  [1] "January"   "February"  "March"     "April"     "May"       "June"     
##  [7] "July"      "August"    "September" "October"   "November"  "December"


Random number functions:

runif(20)
##  [1] 0.17782436 0.64642744 0.89121214 0.70146090 0.79845666 0.26657743
##  [7] 0.63596431 0.19446843 0.40912918 0.69756948 0.24045534 0.87716139
## [13] 0.67787150 0.91604563 0.79698219 0.53838195 0.64656791 0.19914492
## [19] 0.09933631 0.99757726
rnorm(20)
##  [1]  0.314045995 -2.168952965 -1.035106500  2.050484172 -0.893925389
##  [6] -0.004147209  0.794782381  0.919572292  1.473193529  1.972762083
## [11]  0.142524334 -0.451737784  0.767048058 -1.034922964 -0.210531267
## [16] -0.528755024  0.016250052  1.439573960 -0.156178807 -0.359327657
sample(month.abb, 3)
## [1] "Aug" "Apr" "Feb"
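
These functions draw new values every time they run, so your output will not match the numbers shown above. If you need repeatable results, one option is to set the random seed first. A minimal sketch; the seed value 2023 is arbitrary:

```r
set.seed(2023)            # any whole number works as a seed
first_draw <- runif(3)

set.seed(2023)            # reset to the same seed
second_draw <- runif(3)

# The same seed produces the same "random" numbers
identical(first_draw, second_draw)
## [1] TRUE
```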

How Vectors Behave

Vectorized operations

Many R functions and math operators are vectorized (i.e., operate on each individual element).


Examples

First we create two numeric vectors:

x = 0:4
x
## [1] 0 1 2 3 4
y = 11:15
y
## [1] 11 12 13 14 15


Are sin() & cos() vectorized?

sin(x)
## [1]  0.0000000  0.8414710  0.9092974  0.1411200 -0.7568025
cos(x)
## [1]  1.0000000  0.5403023 -0.4161468 -0.9899925 -0.6536436


Addition (like the other arithmetic operators) is vectorized:

x + 1
## [1] 1 2 3 4 5
x + y
## [1] 11 13 15 17 19
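
Vectorization is not limited to arithmetic. As a sketch, comparison operators and many character functions also work element by element:

```r
x <- 0:4

# Comparison operators return one TRUE/FALSE per element
x > 2
## [1] FALSE FALSE FALSE  TRUE  TRUE

# Multiplication is applied to each element
x * 2
## [1] 0 2 4 6 8

# Character functions are often vectorized too
toupper(c("oak", "pine"))
## [1] "OAK"  "PINE"
nchar(month.name[1:3])    # number of characters in each month name
## [1] 7 8 5
```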


Aggregate functions

Functions that accept a vector and return a single value are called aggregate functions.

x = runif(20)
x
##  [1] 0.26211876 0.12979274 0.71034325 0.71292080 0.13439383 0.57130562
##  [7] 0.33741735 0.41426845 0.92729550 0.53577323 0.77293588 0.63427758
## [13] 0.02293783 0.38646294 0.31318852 0.99781761 0.36043713 0.75702638
## [19] 0.73384852 0.06540885


Most descriptive stats functions are aggregate:

mean(x)
## [1] 0.4889985
median(x)
## [1] 0.4750208
sd(x)
## [1] 0.288555


Other aggregate functions:

# first() comes from the dplyr package (not base R), so dplyr must be installed
dplyr::first(state.name)
## [1] "Alabama"
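
Base R has several more aggregate functions that need no extra packages. A short sketch; the `tree_heights` values are made up:

```r
tree_heights <- c(4, 8, 15, 16, 23, 42)   # made-up values

sum(tree_heights)
## [1] 108
min(tree_heights)
## [1] 4
max(tree_heights)
## [1] 42
length(tree_heights)      # number of elements
## [1] 6
```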

Subsetting Vectors

To extract one or more elements from a vector, use square bracket notation. Inside the square brackets, put the index of the element(s) you want. Indices start at 1.


Subset with indices


LETTERS[2]
## [1] "B"

To return multiple elements, pass a vector of indices.

LETTERS[2:4]
## [1] "B" "C" "D"

You can also use square brackets to extract elements in a different order.

LETTERS[4:2]
## [1] "D" "C" "B"


Subset with logicals

You can also put a vector of logical values (TRUE/FALSE) in the brackets. R returns the elements at the positions where the value is TRUE.

LETTERS[c(T,T,T,T,T,T,T,T,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F,F)]
## [1] "A" "B" "C" "D" "E" "F" "G" "H"


Better still, use an expression that returns a vector of logical values:

state.abb[ substr(state.abb, 1, 1) == "N" ]
## [1] "NE" "NV" "NH" "NJ" "NM" "NY" "NC" "ND"
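
The same pattern works with numeric vectors and comparison operators. A minimal sketch; the `rainfall` values are hypothetical:

```r
rainfall <- c(12, 5, 30, 8)   # hypothetical values

# Keep the elements greater than 10
rainfall[rainfall > 10]
## [1] 12 30

# Combine two conditions with & (and)
rainfall[rainfall > 6 & rainfall < 20]
## [1] 12  8

# which() returns the positions of the TRUE values
which(rainfall > 10)
## [1] 1 3
```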


Plotting Vectors

Base R has simple plotting functions you can use to view the distribution of data.


Histograms

To make a histogram, use hist():

my_vals = rnorm(500)
hist(my_vals)


Box Plots

Prefer a box plot?

boxplot(my_vals)


Scatter Plots

The versatile plot() can be used to make a simple scatter plot:

plot(x = 1:20, y = runif(20))


Line Plots

plot(x = 1:20, y = runif(20), type = "b")
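
The base plotting functions accept optional arguments for titles, axis labels, and appearance. A sketch of a few common ones; the titles, labels, color, and seed below are arbitrary choices:

```r
set.seed(1)               # arbitrary seed so the plots are repeatable
my_vals <- rnorm(500)

# Histogram with a title, an x-axis label, more bins, and a fill color
hist(my_vals, breaks = 20, main = "Distribution of my_vals",
     xlab = "Value", col = "grey80")

# type = "l" draws lines only; type = "b" draws both points and lines
plot(x = 1:20, y = runif(20), type = "l",
     xlab = "Index", ylab = "Random value", main = "Line plot")
```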


Exercise 2: Working with Scripts and Vectors

Exercise 2 Topics

  1. Saving code in scripts
  2. Creating vectors
  3. Subsetting vectors
  4. Basic plotting

Exercise 2 Review


Scripts

Top five advantages of using scripts over the console:

  1. Easier to write (and fix!) your code
  2. You can add comments to remind yourself what each command is doing
  3. Reuse your own code
  4. You can add loops and if-then statements later on
  5. Tell your friends you’re a coder!
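
As an illustration of the first three points, here is what a small commented script might look like. The file name, variable names, and simulated data are all hypothetical:

```r
# plot_summary.R (hypothetical file name)
# Summarize and plot some simulated measurements

# Simulate 100 measurements (a stand-in for real data)
set.seed(10)
measurements <- rnorm(100, mean = 50, sd = 5)

# Compute summary statistics
avg <- mean(measurements)
spread <- sd(measurements)

# View the distribution
hist(measurements, main = "Simulated measurements")
```

Saved as a file, a script like this can be rerun, edited, and shared without retyping anything at the console.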


END!