R – Sort DataFrame Rows by Multiple Columns

There are several ways to sort the rows of an R dataframe by multiple columns. This article first covers the base R order() function, then combines it with with() so that columns can be referenced by name. Next, it shows arrange() from the dplyr package, which sorts in ascending order by default and in descending order with desc(). Finally, it covers setorder() from the data.table package.

1. Quick Examples

If you are in a hurry, below are quick examples.


# Create a dataframe with 5 rows and 3 columns
my_dataframe <- data.frame(id = c(2, 1, 3, 4, 5),
                           name = c('sravan', 'jau', 'chrisa', 'shivgami', 'ram'),
                           gender = c('f', 'm', 'm', 'f', 'm'))

# Example 1 - Sort the dataframe by gender and id columns using order()
print(my_dataframe[order(my_dataframe$gender, my_dataframe$id), ])

# Example 2 - Sort the dataframe by gender and id columns using with() and order()
print(my_dataframe[with(my_dataframe, order(gender, id)), ])

# Example 3 - Use the dplyr package
library("dplyr")
print(arrange(my_dataframe, gender, id))

# Example 4 - Sort in descending order (dplyr is already loaded above)
print(arrange(my_dataframe, desc(gender), desc(id)))

# Example 5 - Use the data.table package (setorder() reorders my_dataframe in place)
library("data.table")
print(setorder(my_dataframe, gender, id))

Let’s create an R dataframe with 5 rows and 3 columns.


# Create a dataframe with 5 rows and 3 columns
my_dataframe <- data.frame(id = c(2, 1, 3, 4, 5),
                           name = c('sravan', 'jau', 'chrisa', 'shivgami', 'ram'),
                           gender = c('f', 'm', 'm', 'f', 'm'))

#Display dataframe
print(my_dataframe)

Output:


# Output
  id     name gender
1  2   sravan      f
2  1      jau      m
3  3   chrisa      m
4  4 shivgami      f
5  5      ram      m

Let’s see different ways to sort the dataframe rows based on multiple columns.

2. Sort DataFrame Rows by Multiple Columns using order()

order() is a base R function that returns the row indices that would put the data in ascending order. When it is given several vectors, it sorts by the first one and uses each later one to break ties. The columns are passed as vectors extracted with the $ operator.

Because order() returns row indices rather than the sorted dataframe, we place it in the row position of the [] index. This returns the rows of the dataframe in sorted order, with all columns kept.

Syntax:


# Syntax
my_dataframe[order(my_dataframe$column1, my_dataframe$column2, ...), ]
  • my_dataframe is the input dataframe
  • column1, column2, ... are the names of the columns to sort by, in order of priority

In this example, we will sort the dataframe by the gender and id columns.


#Sort the dataframe by gender and id columns
print(my_dataframe[order(my_dataframe$gender, my_dataframe$id), ])

This yields the output below.


# Output
  id     name gender
1  2   sravan      f
4  4 shivgami      f
2  1      jau      m
3  3   chrisa      m
5  5      ram      m

Notice that the dataframe is sorted by gender first, and records with the same gender are then sorted by the id column. The row names on the left (1, 4, 2, 3, 5) are the original row positions.
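To make the role of order() concrete, the sketch below prints the row indices it returns and then sorts the two columns in different directions. The names df, idx and mixed are illustrative only. The mixed-direction call assumes the radix method of order(), which accepts one decreasing value per sort column.

```r
# Same data as the article, under an illustrative name
df <- data.frame(id = c(2, 1, 3, 4, 5),
                 name = c("sravan", "jau", "chrisa", "shivgami", "ram"),
                 gender = c("f", "m", "m", "f", "m"))

# order() returns row positions, not the sorted dataframe
idx <- order(df$gender, df$id)
print(idx)
# [1] 1 4 2 3 5

# Mixed directions: gender descending, id ascending.
# A vector-valued `decreasing` requires method = "radix".
mixed <- df[order(df$gender, df$id,
                  decreasing = c(TRUE, FALSE), method = "radix"), ]
print(mixed$id)
# [1] 1 3 5 2 4
```

For a numeric column, negating it inside order(), for example order(df$gender, -df$id), is another common way to reverse the direction of that column alone.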

3. Sort R DataFrame Rows by Multiple Columns using with()

with() evaluates an expression in an environment built from the dataframe, so columns can be referenced by name without the $ operator. Here the expression is order(), which returns the row indices that sort the dataframe in ascending order by the given columns. In this usage, with() takes two arguments.


#Syntax
my_dataframe[with(my_dataframe, order(column1, column2, ...)), ]

Parameters:

  1. my_dataframe is the input dataframe
  2. order(column1, column2, ...) is the expression to evaluate; its arguments are the columns to sort by

Let’s see an example using with() and order() to sort the R DataFrame.


#Sort the dataframe by gender and id columns
print(my_dataframe[with(my_dataframe, order(gender, id)), ] )

Output:


# Output
  id     name gender
1  2   sravan      f
4  4 shivgami      f
2  1      jau      m
3  3   chrisa      m
5  5      ram      m
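As a quick check, the sketch below confirms that the with() form gives the same result as the $ form, and shows one way to reset the row names, which base R subsetting carries over from the original rows. The names df, by_dollar and by_with are illustrative only.

```r
# Same data as the article, under an illustrative name
df <- data.frame(id = c(2, 1, 3, 4, 5),
                 name = c("sravan", "jau", "chrisa", "shivgami", "ram"),
                 gender = c("f", "m", "m", "f", "m"))

by_dollar <- df[order(df$gender, df$id), ]
by_with   <- df[with(df, order(gender, id)), ]

# Both forms select the same rows in the same order
same_result <- identical(by_dollar, by_with)
print(same_result)
# [1] TRUE

# The sorted rows keep their original row names (1, 4, 2, 3, 5).
# Assigning NULL renumbers them from 1.
rownames(by_with) <- NULL
print(by_with)
```

Resetting the row names is optional. It only changes the labels printed on the left, not the data.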

4. Sort DataFrame Rows by Multiple Columns using arrange()

arrange() sorts the dataframe in ascending order by default and can sort in descending order with desc(). It takes the dataframe as the first parameter and the column names as the following parameters. This function is available in the dplyr package, so we need to load the package using library("dplyr").

Syntax:


# Syntax
arrange(my_dataframe, column1, column2, ...)

Parameters:

  1. my_dataframe is the input dataframe
  2. column1, column2, ... are the names of the columns to sort by

Example 1: Let’s see an example that sorts the dataframe in ascending order using the arrange() function.


#Load the dplyr package
library("dplyr") 

#Sort the dataframe by gender and id columns
print(arrange(my_dataframe, gender, id) )

Output:


# Output
  id     name gender
1  2   sravan      f
2  4 shivgami      f
3  1      jau      m
4  3   chrisa      m
5  5      ram      m

If you want to sort the R dataframe in descending order, wrap each column in the desc() function.

Syntax:


# Syntax
arrange(my_dataframe, desc(column1), desc(column2), ...)

In this example, we will sort the R dataframe rows by the gender and id columns, both in descending order.


# Load the dplyr package
library("dplyr")

# Sort the dataframe by the gender and id columns, both in descending order
print(arrange(my_dataframe, desc(gender), desc(id)))

Output:


# Output
  id     name gender
1  5      ram      m
2  3   chrisa      m
3  1      jau      m
4  4 shivgami      f
5  2   sravan      f
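desc() applies per column, so directions can be mixed. The sketch below sorts gender in descending order and id in ascending order. It assumes the dplyr package is installed, and the names df and mixed are illustrative only.

```r
library(dplyr)

# Same data as the article, under an illustrative name
df <- data.frame(id = c(2, 1, 3, 4, 5),
                 name = c("sravan", "jau", "chrisa", "shivgami", "ram"),
                 gender = c("f", "m", "m", "f", "m"))

# gender descending, id ascending within each gender
mixed <- arrange(df, desc(gender), id)
print(mixed)

# arrange() returns a new dataframe; df itself is unchanged
print(df$id)
# [1] 2 1 3 4 5
```

Only the columns wrapped in desc() are reversed. Every other column keeps the default ascending order.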

5. Sort R DataFrame Rows by Multiple Columns using setorder()

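setorder() sorts the dataframe by reference, meaning it reorders the rows of the input in place instead of returning a sorted copy. It sorts in ascending order by default; prefixing a column name with - sorts that column in descending order. It takes the dataframe as the first parameter and the column names as the next parameters. This function is available in the data.table package, so we need to load this package.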

Syntax:


# Syntax
setorder(my_dataframe, column1, column2, ...)

Parameters:

  1. my_dataframe is the input dataframe
  2. column1, column2, ... are the names of the columns to sort by

Example: Let’s see an example of using the setorder() function.


# Load the data.table package
library("data.table")

# Sort the dataframe by the gender and id columns
print(setorder(my_dataframe, gender, id))

Output:


# Output
  id     name gender
1  2   sravan      f
2  4 shivgami      f
3  1      jau      m
4  3   chrisa      m
5  5      ram      m
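Because setorder() works in place, it can help to sort a copy when the original order is still needed. The sketch below converts the data to a data.table with as.data.table(), which makes a copy, and then sorts gender in descending order using the - prefix. It assumes the data.table package is installed, and the names df and dt are illustrative only.

```r
library(data.table)

# Same data as the article, under an illustrative name
df <- data.frame(id = c(2, 1, 3, 4, 5),
                 name = c("sravan", "jau", "chrisa", "shivgami", "ram"),
                 gender = c("f", "m", "m", "f", "m"))

# as.data.table() returns a copy, so df keeps its original row order
dt <- as.data.table(df)

# Sorts dt in place: gender descending (the - prefix), id ascending
setorder(dt, -gender, id)
print(dt)

# The original dataframe is untouched
print(df$id)
# [1] 2 1 3 4 5
```

setorder() returns its result invisibly, which is why the examples wrap it in print() or print the object afterwards.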

Conclusion

We have seen different methods to sort R dataframe rows by multiple columns. The first method was the base R order() function, which we then combined with with() so that columns can be referenced by name. We used arrange() from the dplyr package to sort in ascending order and desc() to sort in descending order. Finally, we used setorder() from the data.table package, which sorts the data in place.

Related Articles

R – Drop DataFrame Columns by Name
