R for reproducible scientific analysis
Vectorisation
Learning Objectives
- To understand vectorised operations in R.
Most of R’s functions are vectorised, meaning that the function operates on all elements of a vector at once, without needing to loop through and act on each element one at a time. This makes code more concise, easier to read, and less error-prone. Recall that when we multiplied two columns of the gapminder data, R automatically paired the values element by element.
x <- 1:4
x * 2
[1] 2 4 6 8
The multiplication happened to each element of the vector.
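The same pairing happens when both operands are vectors of equal length. The following is a minimal sketch; the second vector y is made up for illustration and is not part of the lesson's data. The explicit loop is shown only to make clear what the vectorised expression saves us from writing.

```r
x <- 1:4
y <- c(10, 20, 30, 40)  # hypothetical second vector, for illustration only

# Element-wise multiplication: x[1] * y[1], x[2] * y[2], and so on
x * y

# The equivalent explicit loop, for comparison
out <- numeric(length(x))
for (i in seq_along(x)) {
  out[i] <- x[i] * y[i]
}

# Both approaches give the same result
identical(out, x * y)
```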
Comparison operators, logical operators, and many functions are also vectorised:
Comparison operators
a <- (x > 2)
a
[1] FALSE FALSE  TRUE  TRUE
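Logical operators and mathematical functions behave the same way. As a short sketch (the variable name b is chosen here only for illustration), we can combine two comparisons element by element, transform every element with one function call, and use a logical vector to subset:

```r
x <- 1:4

# Logical operators combine comparisons element by element
b <- (x > 1) & (x < 4)
b

# Many functions are vectorised too: each element is transformed in one call
log(x)
sqrt(x)

# A logical vector can be used directly to subset the original vector
x[x > 2]
```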
Apply functions
In a similar way, we can apply a function to each item of a vector or list. This is analogous to comparing each item to a value, or to multiplying each pair of values from two vectors. We will use sapply (a wrapper around lapply that simplifies the result to a vector where possible) to apply a function to each item.
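Before using sapply on the gapminder data, it may help to see it on a small vector. This is only a sketch: the helper is_even and the vector values are hypothetical names introduced for this illustration.

```r
# Hypothetical helper, used only for this illustration
is_even <- function(n) {
  n %% 2 == 0
}

values <- c(3, 8, 15, 22)

# sapply calls is_even once per element and simplifies the results
# into a logical vector of the same length as values
evens <- sapply(values, is_even)
evens
```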
Challenge 1
Remember our function that checked if a value was greater than 1000? Simplify this function to return TRUE if the value is greater than 1000 and FALSE if it isn’t.
check_gdp <- function(gdp) {
  gdp > 1000
}
Now we apply this function to each gdpPercap value of our data.
gdp1000 <- sapply(gapminder$gdpPercap, check_gdp)
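Because the comparison operator > is itself vectorised, a function like check_gdp can also be called on the whole vector directly, and sapply is most useful when the function only handles one value at a time. The sketch below assumes a small made-up vector in place of gapminder$gdpPercap, so that it runs without the dataset:

```r
# Toy stand-in for gapminder$gdpPercap; the values are made up for illustration
gdp <- c(779.4, 5937.0, 974.6, 12779.4)

check_gdp <- function(gdp) {
  gdp > 1000
}

via_sapply <- sapply(gdp, check_gdp)  # one call per element
direct     <- check_gdp(gdp)          # one call on the whole vector

# Both approaches produce the same logical vector
identical(via_sapply, direct)
```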
Challenge 2
Write a function that takes the gapminder dataset, computes this TRUE/FALSE vector, binds it to the gapminder data as a new column, and returns the modified dataset.
newgap <- function(dat) {
  # Flag each row whose gdpPercap exceeds 1000
  gdp1000 <- sapply(dat$gdpPercap, check_gdp)
  # Append the flags as a new column
  dat <- cbind(dat, gdp1000)
  return(dat)
}
head(newgap(gapminder))