How to perform a group by on multiple columns in R DataFrame? By using group_by() function from dplyr package we can perform group by on multiple columns or variables (two or more columns) and summarise on multiple columns for aggregations. Later, I will also explain how to apply summarise() on all columns and finally use multiple aggregation functions together.
1. Quick Examples of Grouping by Multiple Columns
Following are the quick examples of grouping dataframe on multiple columns.
# Group by on multiple columns
agg_tbl <- df %>% group_by(department, state) %>%
summarise(total_salary=sum(salary))
# Summarise on multiple columns
df2<- df[,c("department","state","salary","bonus")]
agg_tbl <- df2 %>% group_by(department, state) %>%
summarise(across(c(salary, bonus), sum))
# Apply multiple summaries
df2<- df[,c("department","state","salary","bonus")]
agg_tbl <- df2 %>% group_by(department, state) %>%
summarise(across(c(salary, bonus), list(mean = mean, sum = sum)))
# Summarise all columns except grouping columns
df2<- df[,c("department","state","age","salary","bonus")]
agg_tbl <- df2 %>% group_by(department, state) %>%
summarise(across(everything(), list(mean = mean, sum = sum)))
Let’s create a DataFrame by reading a CSV file.
# Read CSV file into DataFrame
df = read.csv('/Users/admin/apps/github/r-examples/resources/emp.csv')
df
Yields below output.

2. Group By Multiple Columns in R using dplyr
Use group_by() function in R to group the rows in DataFrame by multiple columns (two or more), to use this function, you have to install dplyr first using install.packages(‘dplyr’) and load it using library(dplyr)
.
All functions in dplyr package take data.frame
as a first argument. When we use dplyr
package, we mostly use the infix operator %>%
from magrittr
, it passes the left-hand side of the operator to the first argument of the operator’s right-hand side. For example, x %>% f(y)
converted into f(x, y)
so the result from the left-hand side is then “piped” into the right-hand side.
I will use infix operator %>%
across all our examples as the output of group_by() function is input to summarise() function.
# Load dplyr
library(dplyr)
# Group by on multiple columns
agg_tbl <- df %>% group_by(department, state) %>%
summarise(total_salary=sum(salary))
agg_tbl
Yields below output. This example does the group by on department
and state
columns and summarises on salary
column and applies the sum
function on each summarised column.

Note that the output of group_by() and summarise() is tibble hence, to convert it to data.frame use as.data.frame()
function.
df2 <- agg_tbl %>% as.data.frame()
3. Grop By & Summarise on Multiple Columns
To perform summarise on multiple columns, create a vector with the column names and use it with across() function.
This example does the group by on department
and state
columns, summarises on salary
& bonus
columns, and apply the sum
function on each summarised column.
# Summarise on multiple columns
agg_tbl <- df %>% group_by(department, state) %>%
summarise(across(c(salary, bonus), sum))
Yields below output.

4. Apply Multiple Summarise Functions
Similarly, you can also perform multiple aggregation functions on all summarise columns in R.
This example does the group by on department
and state
columns, summarises on salary
and bonus
columns, and apply the sum
& mean
functions on each summarised column.
# Apply multiple summaries
agg_tbl <- df %>% group_by(department, state) %>%
summarise(across(c(salary, bonus), list(mean = mean, sum = sum)))
Yields below output.

5. Summarise All Columns Except Grouping Columns
Finally, let’s see how to apply the aggregate functions on all columns of the DataFrame except grouping columns. While doing this make sure your dataframe has only numeric columns plus grouping columns. Having non-numeric on summarise returns an error.
This example does the group by on department
and state
columns, summarises on all columns except grouping columns, and apply the sum
& mean
functions on all summarised columns.
# Summarise all columns except grouping columns
df2<- df[,c("department","state","age","salary","bonus")]
agg_tbl <- df2 %>% group_by(department, state) %>%
summarise(across(everything(), list(mean = mean, sum = sum)))
agg_tbl
Yields below output.

4. Conclusion
In this article, I have explained how to perform group by dataframe on multiple columns and apply different summarising types to get aggregation on grouped data. Since the output of group_by() and summarise() is tibble, use as.data.frame()
function to convert it to data.frame.
Related Articles
- Different Ways to Create a DataFrame in R
- Reorder Columns of DataFrame in R
- R – Join Two or Multiple DataFrames
- How to Read Multiple CSV Files in R
- R Join (Merge) on Multiple Columns
- R Join Multiple Data Frames
- R Write Multiple Lines to Text File