You can sort pandas DataFrame by one or multiple (one or more) columns using sort_values() method and by ascending or descending order. To specify the order, you have to use ascending
boolean property; False
for descending and True
for ascending. By default, it is set to True.
In this article, I will explain how to sort pandas DataFrame with one or multiple columns. By default sort_values() return a copy DataFrame with the result of the sort. To sort on current DataFrame use inplace=True
.
1. Quick Examples of Sort by Multiple Columns in pandas
If you are in a hurry, below are some quick examples of how to sort by multiple columns in pandas DataFrame.
# Below are some quick examples
# Sort multiple columns
df2 = df.sort_values(['Fee', 'Duration'],
ascending = [False, True])
# Sort by two columns
df2 = df.sort_values(['Courses', 'Discount'],
ascending = [True, True])
# Using the sorting function
df.sort_values(["Fee", "Courses"],
axis = 0, ascending = True,
inplace = True,
na_position = "first")
Let’s create a DataFrame with a few rows and columns and execute some examples to learn how sort works.
# Create DataFrame
import pandas as pd
technologies = ({
'Courses':["Spark","Hadoop","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,20000],
'Duration':['30days','35days','40days','50days','60days'],
'Discount':[1000,2300,1500,1200,2500]
})
df = pd.DataFrame(technologies, index = ['r1','r2','r3','r4','r0'])
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r2 Hadoop 25000 35days 2300
r3 pandas 26000 40days 1500
r4 Oracle 22000 50days 1200
r0 Java 20000 60days 2500
2. Sort Multiple Columns in pandas DataFrame
By using the sort_values() method you can sort multiple columns in DataFrame by ascending or descending order. When not specified order, all columns specified are sorted by ascending order.
# Sort multiple columns
df2 = df.sort_values(['Fee', 'Discount'])
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r0 Java 20000 60days 2500
r4 Oracle 22000 50days 1200
r2 Hadoop 25000 35days 2300
r3 pandas 26000 40days 1500
In case if you wanted to update the existing DataFrame use inplace=True
.
# Sort ascending order
df.sort_values(by=['Fee','Discount'], inplace=True)
print(df)
Yields same output as above.
3. Sort in an Ascending Order
Use ascending
param to sort the DataFrame in ascending or descending order. When you have multiple sorting columns. By default, it sorts in ascending order.
# Sort ascending order
df.sort_values(by=['Fee','Discount'], inplace=True,
ascending = [True, True])
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r0 Java 20000 60days 2500
r4 Oracle 22000 50days 1200
r2 Hadoop 25000 35days 2300
r3 pandas 26000 40days 1500
4. Sort Multiple Columns in Descending Order
In case you wanted to sort by descending order, use ascending=False
. You can also specify different sorting orders for each input.
# Sort descending order
df.sort_values(by=['Fee','Discount'], inplace=True,
ascending = [True, False])
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount
r0 Java 20000 60days 2500
r1 Spark 20000 30days 1000
r4 Oracle 22000 50days 1200
r2 Hadoop 25000 35days 2300
r3 pandas 26000 40days 1500
5. Conclusion
In this article, you have learned how to sort a DataFrame by multiple columns using Dataframe.sort_values() by ascending or descending order.
Happy Learning !!
Related Articles
- How to Change Position of a Column in Pandas
- Change the Order of DataFrame Columns in Pandas
- Convert Float to Integer in Pandas DataFrame
- Replace NaN with Blank/Empty String in Pandas
- Pandas groupby() Explained With Examples
- Pandas Filter DataFrame by Multiple Conditions
- Apply Multiple Filters to Pandas DataFrame or Series
- How to Create Pandas Pivot Multiple Columns
- Pandas Merge Multiple DataFrames