R for reproducible scientific analysis
Loops
Learning Objectives
- Understand for loops and know when to use them.
Repeating operations
If you want to iterate over a set of values and perform the same operation on each, and the order of iteration matters, a for loop will do the job. We saw for loops in the shell lessons earlier. This is the most flexible of looping operations, but therefore also the hardest to use correctly. Avoid using for loops unless the order of iteration is important, i.e. the calculation at each iteration depends on the results of previous iterations.
The basic structure of a for loop is:
for (iterator in set of values) {
  do a thing
}
For example:
for (i in 1:10) {
  print(i)
}
[1] 1
[1] 2
[1] 3
[1] 4
[1] 5
[1] 6
[1] 7
[1] 8
[1] 9
[1] 10
The 1:10 bit creates a vector on the fly; you can iterate over any other vector as well.
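As a minimal sketch of iterating over an existing vector, the loop below walks through a made-up vector of numbers and keeps a running total. The names values and running_total are illustrative and not part of the lesson. Because each step uses the result left by the previous one, this is a case where the order of iteration matters.

```r
# Hypothetical example data; any numeric vector would work here.
values <- c(4, 8, 15)

# Start the running total at zero before the loop begins.
running_total <- 0

for (v in values) {
  # Each iteration builds on the total left by the previous one.
  running_total <- running_total + v
  print(running_total)
}
```

This prints 4, 12 and 27 in turn, one value per iteration.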
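We can nest one for loop inside another to iterate over two things at once. The inner loop runs in full for each value of the outer loop, so every combination of the two sets of values is visited.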
for (i in 1:3) {
  for (j in c('a', 'b', 'c')) {
    print(paste(i, j))
  }
}
[1] "1 a"
[1] "1 b"
[1] "1 c"
[1] "2 a"
[1] "2 b"
[1] "2 c"
[1] "3 a"
[1] "3 b"
[1] "3 c"
You can see here that the paste function is useful for pasting two items together: it joins its arguments into a single string, separated by a space by default.
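The following sketch shows how the separator used by paste can be changed. The variable names are invented for illustration; sep is a standard argument of paste, and paste0 is the base R shorthand for joining with no separator.

```r
# Default behaviour: arguments are joined with a single space.
default_label <- paste(1, "a")

# The sep argument changes the separator placed between the items.
dashed_label <- paste(1, "a", sep = "-")

# paste0 is shorthand for paste with sep = "".
compact_label <- paste0(1, "a")

print(default_label)
print(dashed_label)
print(compact_label)
```

The three calls produce "1 a", "1-a" and "1a" respectively.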
Challenge 1
Write a script that loops through the gapminder data by continent and prints out the continent name and mean life expectancy.
Hint: the unique function returns the unique items in a vector, and the mean function calculates the mean of a numeric vector.
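# Loop over each distinct continent name in the data
for (cont in unique(gapminder$continent)) {
  print(cont)
  # Keep only the rows belonging to the current continent
  dat <- gapminder[gapminder$continent == cont, ]
  print(mean(dat$lifeExp))
}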
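Since the order of iteration does not matter in this challenge, the same kind of summary can also be computed without an explicit loop. The sketch below uses the base R function tapply on a small made-up data frame (toy_data, with invented values) rather than the real gapminder data, so the numbers are purely illustrative.

```r
# Hypothetical stand-in for the gapminder data; the values are invented.
toy_data <- data.frame(
  continent = c("Asia", "Asia", "Europe", "Europe"),
  lifeExp = c(60, 70, 75, 85)
)

# tapply splits lifeExp by continent and applies mean to each group.
mean_life_exp <- tapply(toy_data$lifeExp, toy_data$continent, mean)
print(mean_life_exp)
```

The result is a named vector with one mean per continent, here 65 for Asia and 80 for Europe.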