R Sort DataFrame Rows by Multiple Columns

There are several ways to sort dataframe rows by multiple columns in R. The first method discussed is the base R order() function. Next, with() is combined with order() so that columns can be referenced by name. After that, arrange() from the dplyr package sorts the dataframe in ascending order, with desc() used for descending order. Finally, setorder() from the data.table package is covered.

In a nutshell, to sort data frame rows by multiple columns, I will be using the following methods.

  • order()
  • arrange()
  • setorder()

1. Quick Examples of Sort by Multiple Columns

If you are in a hurry, below are quick examples of how to sort dataframes by multiple columns.


# Create dataframe with 5 rows and 3 columns
df <- data.frame(id=c(2,1,3,4,5),
                 name=c('sravan','jau','chrisa','shivgami','ram'),
                 gender=c('f','m','m','f','m'))

# Example 1 - Sort the dataframe by gender and id columns using order()
df2 <- df[order(df$gender, df$id), ]

# Example 2 - Same sort using with() to avoid repeating df$
df2 <- df[with(df, order(gender, id)), ]

# Example 3 - Use arrange() from the dplyr package
library("dplyr")
df2 <- arrange(df, gender, id)

# Example 4 - Sort in descending order using desc()
df2 <- arrange(df, desc(gender), desc(id))

# Example 5 - Use setorder() from the data.table package
# Note: setorder() reorders df itself by reference
library("data.table")
df2 <- setorder(df, gender, id)

Let’s create an R dataframe with 5 rows and 3 columns.


# Create dataframe with 5 rows and 3 columns
df <- data.frame(id=c(2,1,3,4,5),
                 name=c('sravan','jau','chrisa','shivgami','ram'),
                 gender=c('f','m','m','f','m'))

#Display dataframe
print(df)

Output:

# Output
  id     name gender
1  2   sravan      f
2  1      jau      m
3  3   chrisa      m
4  4 shivgami      f
5  5      ram      m

Let’s see different ways to sort the dataframe rows based on multiple columns.

2. Sort by Multiple Columns in R

order() is a base R function that returns the row positions that would sort the given columns in ascending order; it does not return a sorted dataframe by itself. Pass the columns using the $ operator, with the first column as the primary sort key and later columns used to break ties. To get the sorted dataframe, place the order() call in the row position of the [] index operator, which returns the rows in that order with all columns kept.

Syntax:


# Syntax
df[order(df$column1, df$column2, ...), ]

  • df is the input dataframe
  • column1, column2, ... are the columns to sort by, in order of priority

In this example, we will sort the dataframe by gender and id column.


#Sort the dataframe by gender and id columns
df2 <- df[order(df$gender, df$id), ]
df2

This yields the output below.

# Output
  id     name gender
1  2   sravan      f
4  4 shivgami      f
2  1      jau      m
3  3   chrisa      m
5  5      ram      m

Notice that the dataframe is sorted by gender first; rows with the same gender are then sorted by the id column.
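To make the role of order() more concrete, the short sketch below prints the row positions it returns before using them to subset the dataframe. It also shows one common base R way to mix sort directions, which is to negate a numeric column. The names idx, df_sorted, and df_mixed are illustrative and not part of the original example.

```r
# Recreate the example dataframe
df <- data.frame(id = c(2, 1, 3, 4, 5),
                 name = c('sravan', 'jau', 'chrisa', 'shivgami', 'ram'),
                 gender = c('f', 'm', 'm', 'f', 'm'))

# order() returns row positions, not a sorted dataframe
idx <- order(df$gender, df$id)
print(idx)

# Using those positions as row indices produces the sorted dataframe
df_sorted <- df[idx, ]

# Negating a numeric column sorts it in descending order within each gender
df_mixed <- df[order(df$gender, -df$id), ]
print(df_mixed)
```

Negation only works for numeric columns. For character columns, one option is to call order() with method = "radix" and a decreasing vector that has one value per sort column.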

3. Sort Rows by Multiple Columns using with()

with() evaluates an expression using the columns of a dataframe as variables, so the columns can be referenced by name without the df$ prefix. Here the expression is order(), which returns the row positions that sort the dataframe in ascending order by the given columns. with() accepts two parameters.


#Syntax
df[with(df, order(column1, column2, ...)), ]

Parameters:

  1. df is the input dataframe
  2. order(column1, column2, ...) is the expression evaluated against the columns of df

Let’s see an example using with() and order() to sort the R DataFrame.


#Sort the dataframe by gender and id columns
print(df[with(df, order(gender, id)), ] )

Output:


# Output
  id     name gender
1  2   sravan      f
4  4 shivgami      f
2  1      jau      m
3  3   chrisa      m
5  5      ram      m
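As a quick check that with() only changes how the columns are referenced, the sketch below compares the row positions from both spellings. The names idx_with and idx_dollar are made up for this illustration.

```r
# Recreate the example dataframe
df <- data.frame(id = c(2, 1, 3, 4, 5),
                 name = c('sravan', 'jau', 'chrisa', 'shivgami', 'ram'),
                 gender = c('f', 'm', 'm', 'f', 'm'))

# with() evaluates order() using the columns of df as variables
idx_with <- with(df, order(gender, id))

# The same call written with explicit df$ prefixes
idx_dollar <- order(df$gender, df$id)

# Both spellings produce the same row positions
print(identical(idx_with, idx_dollar))
```

The with() form tends to be easier to read when many sort columns are involved, since the dataframe name appears only once.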

4. Sort Rows by Multiple Columns using arrange()

In R, arrange() can also be used to sort the dataframe by multiple columns in ascending or descending order. By default, it sorts in ascending order. It takes the dataframe as the first parameter and the column names as the following parameters. This function is available in the dplyr package, so we need to load the package using library("dplyr").

Syntax:


# Syntax
arrange(df, column1, column2, ...)

Parameters:

  1. df is the input dataframe
  2. column1, column2, ... are the columns to sort by, in order of priority

Example 1: Let’s see an example that sorts the dataframe in ascending order using the arrange() function.


#Load the dplyr package
library("dplyr") 

#Sort the dataframe by gender and id columns
print(arrange(df, gender, id) )

Output:


# Output
  id     name gender
1  2   sravan      f
2  4 shivgami      f
3  1      jau      m
4  3   chrisa      m
5  5      ram      m

If you want to sort the R DataFrame in descending order, wrap each column in the desc() function.

Syntax:


# Syntax
arrange(df, desc(column1), desc(column2), ...)

In this example, we will sort the R dataframe rows by gender and id column in descending order.


#load the dplyr package
library("dplyr") 

#sort the dataframe by gender and id columns in descending order
print(arrange(df, desc(gender), desc(id)) )

Output:


# Output
  id     name gender
1  5      ram      m
2  3   chrisa      m
3  1      jau      m
4  4 shivgami      f
5  2   sravan      f
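desc() can be applied to individual columns, so ascending and descending keys can be mixed in one arrange() call. The sketch below is one possible illustration, assuming the dplyr package is installed; df_mixed is an illustrative name.

```r
# Load the dplyr package (assumed to be installed)
library(dplyr)

# Recreate the example dataframe
df <- data.frame(id = c(2, 1, 3, 4, 5),
                 name = c('sravan', 'jau', 'chrisa', 'shivgami', 'ram'),
                 gender = c('f', 'm', 'm', 'f', 'm'))

# Ascending by gender, then descending by id within each gender
df_mixed <- arrange(df, gender, desc(id))
print(df_mixed)

# arrange() returns a new dataframe; df keeps its original row order
print(df$id)
```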

5. Sort Rows by Multiple Columns using setorder()

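setorder() also sorts a dataframe in ascending order by default. It takes the dataframe as the first parameter and the column names as the following parameters. Unlike order() and arrange(), setorder() reorders the rows of its input by reference, so df itself is modified rather than copied. This function is available in the data.table package, so we need to load this package.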

Syntax:


# Syntax
setorder(df, column1, column2, ...)

Parameters:

  1. df is the input dataframe
  2. column1, column2, ... are the columns to sort by, in order of priority

Example: Let’s see an example of using setorder() function.


#load the data.table package
library("data.table") 

#sort the dataframe by gender and id columns
print(setorder(df, gender, id) )

Output:


  id     name gender
1  2   sravan      f
2  4 shivgami      f
3  1      jau      m
4  3   chrisa      m
5  5      ram      m
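Because setorder() works by reference, it can be useful to sort a copy when the original row order is still needed. The sketch below is a minimal illustration, assuming the data.table package is installed and using a data.table built directly rather than the dataframe above; dt and dt_sorted are illustrative names. It also shows the minus prefix, which setorder() accepts to sort a column in descending order.

```r
# Load the data.table package (assumed to be installed)
library(data.table)

# Build the example data as a data.table
dt <- data.table(id = c(2, 1, 3, 4, 5),
                 name = c('sravan', 'jau', 'chrisa', 'shivgami', 'ram'),
                 gender = c('f', 'm', 'm', 'f', 'm'))

# copy() creates an independent copy so the original is left untouched
dt_sorted <- copy(dt)

# Sort the copy in place: ascending by gender, descending by id
setorder(dt_sorted, gender, -id)

print(dt$id)         # original row order
print(dt_sorted$id)  # sorted row order
```

Sorting in place avoids allocating a second table, which is the main reason to choose setorder() for large data; the copy() call above gives that up in exchange for keeping the original.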

6. Conclusion

We have seen different methods to sort R dataframe rows based on multiple columns. The first method we discussed is the order() function. Next, we applied with() along with order(). We then used arrange() from the dplyr package to sort the dataframe by multiple columns in ascending order, with desc() for descending order. Finally, we used setorder() from the data.table package, which also sorts the dataframe rows by multiple columns.
