R for reproducible scientific analysis

Vectorisation

Learning Objectives

  • To understand vectorised operations in R.

Most of R’s functions are vectorised, meaning that the function operates on all elements of a vector at once, without needing a loop that acts on each element one at a time. This makes code more concise, easier to read, and less error-prone. Remember how, when we multiplied two columns of the gapminder data, R automatically paired up the values?

x <- 1:4
x * 2
[1] 2 4 6 8

The multiplication happened to each element of the vector.
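The pairing of values mentioned above also applies to two plain vectors of the same length. The following is a minimal sketch, assuming two made-up vectors x and y (y is not part of the lesson data), that compares the vectorised form with an explicit loop:

```r
# Two vectors of equal length (y holds made-up values for illustration)
x <- 1:4
y <- c(10, 20, 30, 40)

# Vectorised: multiplies x[1] * y[1], x[2] * y[2], and so on
x * y

# The same calculation written as an explicit loop
result <- numeric(length(x))
for (i in seq_along(x)) {
  result[i] <- x[i] * y[i]
}
result
```

Both forms give the same values, but the vectorised version is a single expression with no index bookkeeping.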

Comparison operators, logical operators, and many functions are also vectorised:

Comparison operators

a <- (x > 2)
a
[1] FALSE FALSE  TRUE  TRUE
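Logical operators and many built-in functions behave the same way as the comparison above. As a small sketch, reusing the vector x from earlier (the names in_range and roots are chosen here for illustration):

```r
x <- 1:4

# Logical operators combine two comparisons element by element
in_range <- (x > 1) & (x < 4)
in_range

# Many built-in functions, such as sqrt(), are vectorised too
roots <- sqrt(x)
roots
```

In each case the result has one value per element of x.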

Apply functions

We can take advantage of vectorisation to apply a function to each item in a list or vector. This is similar to comparing each item to a value, or to multiplying each pair of values from two vectors together. We will use sapply (a version of lapply that simplifies its result to a vector where possible) to apply a function to each item.
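As a rough illustration of what sapply does, the sketch below applies mean to each element of a small list. The list measurements and its values are hypothetical and are used only to show the shape of the result:

```r
# A hypothetical list of numeric vectors with different lengths
measurements <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

# sapply calls mean() once per list element and simplifies the
# results into a named numeric vector
means <- sapply(measurements, mean)
means

# lapply performs the same calls but always returns a list
lapply(measurements, mean)
```

Because every call returns a single number, sapply can collapse the results into one vector with the names a, b, and c.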

Challenge 1

Remember our function that checked if a value was greater than 1000? Simplify this function to return TRUE if the value is greater than 1000 and FALSE if it isn’t.

check_gdp <- function(gdp) {
  gdp > 1000
}

Now we apply this function to each gdpPercap value of our data.

gdp1000 <- sapply(gapminder$gdpPercap, check_gdp)
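Because the comparison inside check_gdp is itself vectorised, calling the function directly on the whole vector should give the same result as sapply. Here is a minimal sketch, assuming a short made-up vector gdp_values in place of gapminder$gdpPercap:

```r
# Stand-in for gapminder$gdpPercap (made-up values for illustration)
gdp_values <- c(450, 1000, 2300, 980, 15000)

check_gdp <- function(gdp) {
  gdp > 1000
}

# Element by element, via sapply
via_sapply <- sapply(gdp_values, check_gdp)

# In one step, relying on the vectorised comparison
via_vector <- check_gdp(gdp_values)

# Check whether the two approaches agree
identical(via_sapply, via_vector)
```

sapply is most useful when the function being applied only works on one value at a time.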

Challenge 2

Write a function that takes the gapminder dataset, computes this TRUE/FALSE vector, binds it to the gapminder data as a new column, and returns the modified dataset.

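newgap <- function(dat) {
  # TRUE where gdpPercap is greater than 1000, FALSE otherwise
  gdp1000 <- sapply(dat$gdpPercap, check_gdp)
  # Add the logical vector to the data as a new column named gdp1000
  dat <- cbind(dat, gdp1000)
  return(dat)
}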

head(newgap(gapminder))
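To see what the function returns without loading the full dataset, it can be tried on a small stand-in. The sketch below assumes a toy data frame with hypothetical countries and values. Only the gdpPercap column name is taken from gapminder:

```r
# Toy data frame standing in for gapminder (hypothetical values)
toy <- data.frame(
  country = c("A", "B", "C"),
  gdpPercap = c(800, 1200, 5000)
)

check_gdp <- function(gdp) {
  gdp > 1000
}

newgap <- function(dat) {
  gdp1000 <- sapply(dat$gdpPercap, check_gdp)
  cbind(dat, gdp1000)
}

# The result keeps the original columns and gains a gdp1000 column
result <- newgap(toy)
result
```

The new column takes its name from the variable passed to cbind, so it appears as gdp1000.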