How to Sort Multiple Columns in pandas DataFrame

You can sort pandas DataFrame by one or multiple (one or more) columns using sort_values() method and by ascending or descending order. To specify the order, you have to use ascending boolean property; False for descending and True for ascending. By default, it is set to True.

In this article, I will explain how to sort pandas DataFrame with one or multiple columns. By default sort_values() return a copy DataFrame with the result of the sort. To sort on current DataFrame use inplace=True.

1. Quick Examples of Sort by Multiple Columns in pandas

If you are in a hurry, below are some quick examples of how to sort by multiple columns in pandas DataFrame.


# Below are some quick examples

# Sort multiple columns
df2 = df.sort_values(['Fee', 'Duration'],
              ascending = [False, True])

# Sort by two columns 
df2 = df.sort_values(['Courses', 'Discount'],
              ascending = [True, True])

# Using the sorting function
df.sort_values(["Fee", "Courses"],
               axis = 0, ascending = True,
               inplace = True,
               na_position = "first")

Let’s create a DataFrame with a few rows and columns and execute some examples to learn how sort works.


# Create DataFrame
import pandas as pd
technologies = ({
    'Courses':["Spark","Hadoop","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,20000],
    'Duration':['30days','35days','40days','50days','60days'],
    'Discount':[1000,2300,1500,1200,2500]
               })
df = pd.DataFrame(technologies, index = ['r1','r2','r3','r4','r0'])
print(df)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
r1   Spark  20000   30days      1000
r2  Hadoop  25000   35days      2300
r3  pandas  26000   40days      1500
r4  Oracle  22000   50days      1200
r0    Java  20000   60days      2500

2. Sort Multiple Columns in pandas DataFrame

By using the sort_values() method you can sort multiple columns in DataFrame by ascending or descending order. When not specified order, all columns specified are sorted by ascending order.


# Sort multiple columns
df2 = df.sort_values(['Fee', 'Discount'])
print(df2)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
r1   Spark  20000   30days      1000
r0    Java  20000   60days      2500
r4  Oracle  22000   50days      1200
r2  Hadoop  25000   35days      2300
r3  pandas  26000   40days      1500

In case if you wanted to update the existing DataFrame use inplace=True.


# Sort ascending order
df.sort_values(by=['Fee','Discount'], inplace=True)
print(df)

Yields same output as above.

3. Sort in an Ascending Order

Use ascending param to sort the DataFrame in ascending or descending order. When you have multiple sorting columns. By default, it sorts in ascending order.


# Sort ascending order
df.sort_values(by=['Fee','Discount'], inplace=True,
               ascending = [True, True])
print(df)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
r1   Spark  20000   30days      1000
r0    Java  20000   60days      2500
r4  Oracle  22000   50days      1200
r2  Hadoop  25000   35days      2300
r3  pandas  26000   40days      1500

4. Sort Multiple Columns in Descending Order

In case you wanted to sort by descending order, use ascending=False. You can also specify different sorting orders for each input.


# Sort descending order
df.sort_values(by=['Fee','Discount'], inplace=True,
               ascending = [True, False])
print(df)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
r0    Java  20000   60days      2500
r1   Spark  20000   30days      1000
r4  Oracle  22000   50days      1200
r2  Hadoop  25000   35days      2300
r3  pandas  26000   40days      1500

5. Conclusion

In this article, you have learned how to sort a DataFrame by multiple columns using Dataframe.sort_values() by ascending or descending order.

Happy Learning !!

Reference

Naveen (NNK)

Naveen (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ @ LinkedIn

Leave a Reply

You are currently viewing How to Sort Multiple Columns in pandas DataFrame