Here I will reemphasize that learning how to think about working with data is more important than learning how to use specific tools. You can use the dplyr package to get the same results as with plyr but in a slightly different way.

dplyr implements the following verbs useful for data manipulation:

select(): focus on a subset of variables
filter(): focus on a subset of rows
mutate(): add new columns
summarise(): reduce each group to a smaller number of summary statistics
arrange(): re-order the rows
do(): apply any R function to each group of the data
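
Two of these verbs, select() and arrange(), are not demonstrated elsewhere in this section, so here is a minimal sketch of both. The data frame toy and every value in it are made up for illustration; only the column names are borrowed from gapminder.

```r
library(dplyr)

# Toy data with made-up values (hypothetical, not real gapminder rows)
toy <- data.frame(
  country   = c("A", "B", "C", "D"),
  continent = c("Asia", "Europe", "Asia", "Europe"),
  pop       = c(10, 20, 30, 40),
  gdpPercap = c(1000, 4000, 3000, 2000),
  stringsAsFactors = FALSE
)

# select(): keep only the columns we care about
slim <- select(toy, country, gdpPercap)

# arrange(): re-order the rows by gdpPercap, largest first
ranked <- arrange(slim, desc(gdpPercap))
ranked
```

With real data you would replace toy with gapminder. The pattern stays the same: the data frame comes first, followed by the columns to keep or to sort by.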

Here we take our data, use the group_by function to do the splitting, and then summarise each group with a function (here we take the mean of gdpPercap).

library(dplyr)
grouped_gap <- group_by(gapminder, continent)
summarise(grouped_gap, gdp = mean(gdpPercap))
## # A tibble: 5 × 2
##   continent       gdp
##      <fctr>     <dbl>
## 1    Africa  2193.755
## 2  Americas  7136.110
## 3      Asia  7902.150
## 4    Europe 14469.476
## 5   Oceania 18621.609

The way I have written the commands above works, but it is difficult to read. The %>% operator works as a pipe in R the way that | did in bash. It is loaded with dplyr or the magrittr package. We can pipe the data to group_by and then pipe that to summarise, which makes our workflow more readable. You usually need to group prior to summarising. You can also group by more than one variable.

gapminder %>% group_by(continent) %>% summarise(gdp = mean(gdpPercap))
## # A tibble: 5 × 2
##   continent       gdp
##      <fctr>     <dbl>
## 1    Africa  2193.755
## 2  Americas  7136.110
## 3      Asia  7902.150
## 4    Europe 14469.476
## 5   Oceania 18621.609
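
Grouping by more than one variable, mentioned above, might look like the sketch below. The data frame toy and its values are invented for illustration, with column names chosen to match gapminder. The trailing ungroup() is my own addition, so that the result carries no leftover grouping.

```r
library(dplyr)

# Toy data with made-up values (hypothetical, not real gapminder rows)
toy <- data.frame(
  continent = c("Asia", "Asia", "Asia", "Europe", "Europe", "Europe"),
  year      = c(2002, 2007, 2007, 2002, 2002, 2007),
  gdpPercap = c(1000, 2000, 4000, 10000, 20000, 30000),
  stringsAsFactors = FALSE
)

# One summary row per continent-year combination
by_cont_year <- toy %>%
  group_by(continent, year) %>%
  summarise(gdp = mean(gdpPercap)) %>%
  ungroup()

by_cont_year
```

Each distinct combination of continent and year becomes its own group, so the four combinations present in the toy data yield four summary rows.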

Note this is a tibble, which is “a modern reimagining of the data.frame, keeping what time has proven to be effective, and throwing out what is not.” The result is the same as when we used dlply previously.

You can even use pipes to send your results to ggplot directly.

gapminder %>% group_by(continent) %>% summarise(gdp = mean(gdpPercap)) %>%
  ggplot(aes(x=continent,y=gdp))+geom_point()

Here is an example of filtering. Note the simplicity compared to our previous approach to subsetting data.

gapminder %>% filter(year==2007) %>%
  ggplot(aes(x=continent,y=gdpPercap*pop))+geom_point()+scale_y_log10()
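
filter() also accepts several conditions at once. The sketch below uses a made-up data frame (toy and all of its values are hypothetical) to show that conditions separated by commas must all hold, while | keeps rows that match either side.

```r
library(dplyr)

# Toy data with made-up values (hypothetical, not real gapminder rows)
toy <- data.frame(
  country   = c("A", "B", "C", "D"),
  continent = c("Asia", "Asia", "Europe", "Europe"),
  year      = c(2002, 2007, 2002, 2007),
  gdpPercap = c(1000, 2000, 10000, 30000),
  stringsAsFactors = FALSE
)

# Comma-separated conditions are combined with AND
asia_2007 <- toy %>% filter(year == 2007, continent == "Asia")

# Use | when either condition is enough
asia_or_rich <- toy %>% filter(continent == "Asia" | gdpPercap > 25000)

asia_2007
asia_or_rich
```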

Here is an example of using mutate and sending the resulting output into the pipe, rather than calculating the y value when ggplot is called.

mutate(gapminder,gdp=gdpPercap*pop) %>% filter(year==2007) %>%
  ggplot(aes(x=continent,y=gdp))+geom_point()+scale_y_log10()
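
A column created with mutate can also feed straight into a grouped summary. This is a small sketch on made-up data (toy, its values, and the column name total_gdp are all hypothetical) that computes total GDP per continent.

```r
library(dplyr)

# Toy data with made-up values (hypothetical, not real gapminder rows)
toy <- data.frame(
  continent = c("Asia", "Asia", "Europe"),
  pop       = c(10, 20, 30),
  gdpPercap = c(100, 200, 300),
  stringsAsFactors = FALSE
)

# Compute GDP per row, then add it up within each continent
totals <- toy %>%
  mutate(gdp = gdpPercap * pop) %>%
  group_by(continent) %>%
  summarise(total_gdp = sum(gdp))

totals
```

Because mutate runs before group_by, the new gdp column is available to summarise like any other column.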

Some of this material was taken from the dplyr GitHub README.