R for reproducible scientific analysis

Loops

Learning Objectives

  • Understand for loops and know when to use them.

Repeating operations

If you want to iterate over a set of values and perform the same operation on each, and the order of iteration matters, a for loop will do the job. We saw for loops in the shell lessons earlier. The for loop is the most flexible looping construct, but that flexibility also makes it the hardest to use correctly. Avoid for loops unless the order of iteration matters, i.e. the calculation at each iteration depends on the results of previous iterations.
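
As a minimal sketch of a calculation where order matters, the loop below builds the first ten Fibonacci numbers. Each value is the sum of the two before it, so the iterations cannot be reordered. The names n and fib are made up for illustration, and the loop syntax itself is explained just below.

```r
# Hypothetical example: each element depends on the two previous ones,
# so the order of iteration matters and a for loop is appropriate.
n <- 10
fib <- numeric(n)   # pre-allocate the result vector
fib[1:2] <- 1       # the first two values are fixed
for (i in 3:n) {
  fib[i] <- fib[i - 1] + fib[i - 2]
}
print(fib)          # the sequence 1 1 2 3 5 8 13 21 34 55

# By contrast, squaring each value does not depend on earlier results,
# so a vectorised expression does the job without a loop.
squares <- (1:n)^2
```

Pre-allocating fib with numeric(n) avoids growing the vector on every iteration, which is slow for long loops.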

The basic structure of a for loop is:

for(iterator in set of values){
  do a thing
}

For example:

for(i in 1:10){
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10

The 1:10 bit creates a vector on the fly; you can iterate over any other vector as well.
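
To illustrate iterating over other vectors, here is a small sketch that loops over a character vector directly, and then over its positions with seq_along. The vector continents and its values are assumptions chosen for the example, as is the result vector labelled.

```r
# Hypothetical character vector to iterate over
continents <- c("Africa", "Asia", "Europe")

# Iterate over the values themselves
for (name in continents) {
  print(name)
}

# Iterate over the positions instead, which is useful when you
# need both the index and the value
labelled <- character(length(continents))
for (i in seq_along(continents)) {
  labelled[i] <- paste(i, continents[i])
}
print(labelled)
# [1] "1 Africa" "2 Asia"   "3 Europe"
```

seq_along(continents) is safer than 1:length(continents) because it produces an empty sequence when the vector is empty, so the loop body is skipped.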

We can nest a for loop inside another for loop to iterate over every combination of two sets of values.

for (i in 1:3){
  for(j in c('a', 'b', 'c')){
    print(paste(i,j))
  }
}
[1] "1 a"
[1] "1 b"
[1] "1 c"
[1] "2 a"
[1] "2 b"
[1] "2 c"
[1] "3 a"
[1] "3 b"
[1] "3 c"

You can see here that the paste function joins its arguments into a single string, separated by a space by default.
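
A short sketch of how paste behaves with different separators may help here. The variable names are invented for the example; paste, its sep argument, and paste0 are part of base R.

```r
# Default separator is a single space
with_space <- paste(1, "a")             # "1 a"

# The sep argument changes the separator
with_dash <- paste(1, "a", sep = "-")   # "1-a"

# paste0 joins with no separator at all
no_sep <- paste0(1, "a")                # "1a"

# paste is vectorised: it pairs elements up position by position
pairs <- paste(1:3, c("a", "b", "c"))   # "1 a" "2 b" "3 c"
```

Note that the vectorised call pairs elements by position, so it gives three strings rather than the nine combinations produced by the nested loop.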

Challenge 1

Write a script that loops through the gapminder data by continent and prints out the continent name and mean life expectancy.

hint: the unique function returns the unique items in a vector, and the mean function calculates the mean of a numeric vector

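# Loop over each distinct continent in the data
for (cont in unique(gapminder$continent)){
  print(cont)
  # Keep only the rows belonging to the current continent
  dat <- gapminder[gapminder$continent == cont, ]
  print(mean(dat$lifeExp))
}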
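
To try the same pattern without loading gapminder, here is a self-contained sketch on a tiny made-up data frame. The data frame toy, its continent labels, and its lifeExp values are purely illustrative and are not real gapminder data. It prints the name and the mean on one line using paste.

```r
# Made-up stand-in for gapminder (values are illustrative, not real data)
toy <- data.frame(
  continent = c("A", "A", "B", "B"),
  lifeExp = c(50, 60, 70, 80),
  stringsAsFactors = FALSE
)

means <- numeric(0)   # collect the results so they can be reused later
for (cont in unique(toy$continent)) {
  # Keep only the rows belonging to the current continent
  dat <- toy[toy$continent == cont, ]
  means[cont] <- mean(dat$lifeExp)
  print(paste(cont, means[cont]))
}
# [1] "A 55"
# [1] "B 75"
```

Storing each mean in a named vector, rather than only printing it, keeps the results available after the loop finishes.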