Remember how we wrote a function to calculate GDP for a given dataset limited by country and year?
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat, year=NULL, country=NULL) {
  if (!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    dat <- dat[dat$country %in% country, ]
  }
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}
When we use this function we know that it takes a dataset with columns labeled pop and gdpPercap. We can check that this function works for some different datasets.
gapm_wgdp <- calcGDP(gapminder[1:20,])
gapm_wgdp <- calcGDP(gapminder %>% filter(continent=='Asia'), country ='Afghanistan')
What if we accidentally misspell Afghanistan?
gapm_wgdp <- calcGDP(gapminder %>% filter(continent=='Asia'), country ='Afganistan')
This code runs without an error, but the result contains no rows and nothing tells us why. It would be helpful if we were notified that we did something wrong.
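To see why nothing comes back, it can help to look at the filtering step on its own. The sketch below uses a small made-up data frame (toy is hypothetical and only stands in for gapminder) with the same subsetting idiom as calcGDP: a misspelled name matches no rows, so the result is empty and no error is raised.

```r
# Hypothetical toy data standing in for gapminder (values are made up)
toy <- data.frame(
  country   = c("Afghanistan", "Afghanistan", "India"),
  year      = c(2002, 2007, 2007),
  pop       = c(100, 200, 300),
  gdpPercap = c(1, 2, 3)
)

# Same filtering idiom as in calcGDP: %in% is FALSE for every row when
# the name is misspelled, so no rows survive and R does not complain.
misspelled <- toy[toy$country %in% "Afganistan", ]
nrow(misspelled)

# With the correct spelling, the two matching rows are kept.
correct <- toy[toy$country %in% "Afghanistan", ]
nrow(correct)
```

Checking nrow() on a result is a quick way to notice an empty subset, but it relies on us remembering to look.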
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat, year=NULL, country=NULL) {
  if (!is.null(year)) {
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    # all() keeps the condition a single TRUE/FALSE when several countries are given
    if (all(country %in% dat$country)) {
      dat <- dat[dat$country %in% country, ]
    } else {
      stop('I am sorry, but ', paste(country, collapse=', '), ' is not in this dataset')
    }
  }
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}
gapm_wgdp <- calcGDP(gapminder, country ='Afganistan')
Now the function stops with an error instead of silently continuing with an empty result.
Similarly, what if we accidentally mixed up the order of year and country?
gapm_wgdp <- calcGDP(gapminder, 'Afghanistan',2007)
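This gives an error that might confuse you: it says that 2007 is not in this dataset, even though you know it is. Because the arguments are unnamed, R matches them by position, so 'Afghanistan' is taken as the year and 2007 as the country. Let’s tell the function to stop if the year isn’t a number.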
# Takes a dataset and multiplies the population column
# with the GDP per capita column.
calcGDP <- function(dat, year=NULL, country=NULL) {
  if (!is.null(year)) {
    if (!is.numeric(year)) {
      stop(year, ' is not a number')
    }
    dat <- dat[dat$year %in% year, ]
  }
  if (!is.null(country)) {
    # all() keeps the condition a single TRUE/FALSE when several countries are given
    if (all(country %in% dat$country)) {
      dat <- dat[dat$country %in% country, ]
    } else {
      stop('I am sorry, but ', paste(country, collapse=', '), ' is not in this dataset')
    }
  }
  gdp <- dat$pop * dat$gdpPercap
  new <- cbind(dat, gdp=gdp)
  return(new)
}
Now we get a more useful message.
gapm_wgdp <- calcGDP(gapminder, 'Afghanistan',2007)
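The mix-up happens because R matches unnamed arguments by position. As a minimal sketch, the hypothetical helper show_args below (not part of the lesson's code) has the same argument order as calcGDP and simply reports what it received; naming the arguments makes their order irrelevant.

```r
# Hypothetical helper with the same argument order as calcGDP;
# it only reports which value ended up in which argument.
show_args <- function(dat, year = NULL, country = NULL) {
  list(year = year, country = country)
}

# Unnamed arguments are matched by position:
# 'Afghanistan' lands in year and 2007 lands in country.
by_position <- show_args(NULL, "Afghanistan", 2007)
by_position$year
by_position$country

# Named arguments can be given in any order.
by_name <- show_args(NULL, country = "Afghanistan", year = 2007)
by_name$year
by_name$country
```

Calling the function with named arguments avoids this particular mistake, while the check inside the function still catches it when we forget.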
One of the advantages of putting code into functions is that we can test each unit of code to make sure it works properly. Each time we broke the code we added a test to the function to make sure we’re notified next time we break the function this way. This is good programming practice. (There are other ways to test functions, but this is a nice simple one for now.)
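One such alternative is to keep the checks outside the function and run them whenever the function changes. The sketch below is only an illustration in base R: toy is made-up data, and calcGDP_sketch is a simplified, hypothetical stand-in for calcGDP that handles the country argument only.

```r
# Hypothetical toy data (made-up numbers), standing in for gapminder
toy <- data.frame(
  country   = c("Afghanistan", "India"),
  year      = c(2007, 2007),
  pop       = c(100, 300),
  gdpPercap = c(2, 3)
)

# Simplified stand-in for calcGDP, keeping only the country check
calcGDP_sketch <- function(dat, country = NULL) {
  if (!is.null(country)) {
    if (!all(country %in% dat$country)) {
      stop(paste(country, collapse = ", "), " is not in this dataset")
    }
    dat <- dat[dat$country %in% country, ]
  }
  cbind(dat, gdp = dat$pop * dat$gdpPercap)
}

# Test 1: a known input gives the GDP we can work out by hand (300 * 3)
result <- calcGDP_sketch(toy, country = "India")
stopifnot(result$gdp == 900)

# Test 2: a misspelled country raises an error instead of returning rows
raised <- tryCatch(
  {
    calcGDP_sketch(toy, country = "Afganistan")
    FALSE
  },
  error = function(e) TRUE
)
stopifnot(raised)
```

If either stopifnot() fails, R reports which condition was not met, so a later change that breaks the function is noticed straight away. Packages such as testthat build on the same idea with more structure.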
Challenge: Now find a way to break the code yourself, then add a test so that the function stops with an error if you make the same mistake again, rather than carrying on with incorrect results.