The pandas DataFrame.sort_index() function sorts a DataFrame by its index labels or by its column names. It takes the parameters axis, level, ascending, inplace, kind, na_position, sort_remaining, ignore_index, and key, and returns a new DataFrame with the sorted result. Use inplace=True to update the existing DataFrame instead.
sort_index() Key Points
- Sorting is applied to the axis labels, not to the data values.
- Use axis=1 to sort by column names and axis=0 to sort by index.
- Supports different sorting algorithms: ‘quicksort’, ‘mergesort’, ‘heapsort’, and ‘stable’.
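To illustrate the last key point, here is a minimal sketch of how a stable algorithm behaves when the index contains duplicate labels. The DataFrame df_dup and its values are made up for this illustration. With kind='stable', rows that share a label keep their original relative order.

```python
import pandas as pd

# Hypothetical data: the index contains duplicate labels
df_dup = pd.DataFrame(
    {"Courses": ["Spark", "Java", "pandas", "Hadoop"]},
    index=[2, 1, 2, 1],
)

# A stable sort keeps rows with equal labels in their original relative order
stable_sorted = df_dup.sort_index(kind="stable")
print(stable_sorted)
```

The choice of algorithm only becomes visible with duplicate labels like these; with unique labels every algorithm produces the same result.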
Here, I will explain the syntax and usage of the sort_index() method with examples.
1. Syntax of DataFrame.sort_index()
The following is the syntax of pandas.DataFrame.sort_index().
# Syntax of sort_index() function
DataFrame.sort_index(axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index=False, key=None)
axis – Axis to be sorted: 0 or ‘index’ for rows, 1 or ‘columns’ for columns. Default is 0.
level – If not None, sort on values in the specified index level(s).
ascending – bool or list of bool. Sort in ascending or descending order. Default is True (ascending).
inplace – If True, updates the existing DataFrame. Default is False.
kind – Sorting algorithm to choose from {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}. Default is ‘quicksort’.
na_position – Where to place NaNs, {‘first’, ‘last’}. Default is ‘last’.
sort_remaining – If True and sorting by level on a multilevel index, sort by the other levels too (in order) after sorting by the specified level.
ignore_index – If True, the resulting index is reset to start from zero. Default is False.
key – callable, optional. If not None, the function is applied to the index values before sorting.
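Two of these parameters, key and na_position, are not used in the examples that follow, so here is a small sketch of both. The DataFrames df_case and df_nan are hypothetical and exist only for this illustration. The sketch also assumes a pandas version that supports key (1.1 or later).

```python
import numpy as np
import pandas as pd

# Hypothetical data with mixed-case string labels
df_case = pd.DataFrame({"Fee": [20000, 25000, 30000]}, index=["b", "A", "C"])

# The default comparison is case-sensitive, so uppercase labels come first
default_order = df_case.sort_index().index.tolist()

# key receives the whole Index and must return an Index of the same length
lower_order = df_case.sort_index(key=lambda idx: idx.str.lower()).index.tolist()

print(default_order)  # ['A', 'C', 'b']
print(lower_order)    # ['A', 'b', 'C']

# Hypothetical data with a missing index label
df_nan = pd.DataFrame({"Fee": [20000, 25000, 30000]}, index=[2.0, np.nan, 1.0])

# na_position='first' places the NaN label at the top instead of the bottom
nan_first = df_nan.sort_index(na_position="first")
print(nan_first["Fee"].tolist())  # [25000, 30000, 20000]
```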
Let’s understand these parameters by running some examples. First, let’s create a DataFrame with a few rows and columns. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
# Create a pandas DataFrame.
import pandas as pd
import numpy as np
technologies = ({
'Courses':["Spark",np.nan,"pandas","Java","Spark"],
'Fee' :[20000,25000,30000,22000,26000],
'Duration':['30days','40days','35days','60days','50days'],
'Discount':[1000,2500,1500,1200,3000]
})
df = pd.DataFrame(technologies, index = [101,123,115,340,100])
print(df)
2. pandas Sort by Index
By default, the pandas sort_index() function sorts DataFrame rows by index in ascending order and returns a new DataFrame. Use inplace=True to sort the existing DataFrame in place, in which case the method returns None.
# Default sort by index labels
df2 = df.sort_index()
print(df2)
This yields the output below.
# Output:
    Courses    Fee Duration  Discount
100   Spark  26000   50days      3000
101   Spark  20000   30days      1000
115  pandas  30000   35days      1500
123     NaN  25000   40days      2500
340    Java  22000   60days      1200
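The in-place behavior mentioned in this section can be sketched as follows. The small DataFrame df_inplace is a hypothetical stand-in that reuses a few index labels from the example. Note that the return value is None, so the result should not be assigned back to the variable.

```python
import pandas as pd

# Small hypothetical DataFrame reusing index labels from the example above
df_inplace = pd.DataFrame({"Fee": [20000, 25000, 30000]}, index=[101, 123, 115])

# With inplace=True the method modifies df_inplace itself and returns None
result = df_inplace.sort_index(inplace=True)

print(result)                     # None
print(df_inplace.index.tolist())  # [101, 115, 123]
```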
3. Sort by Descending Order
By default, sorting is in ascending order. To sort in descending order, use the ascending=False parameter.
# Sort by Descending order
df2 = df.sort_index(ascending=False)
print(df2)
This yields the output below.
# Output:
    Courses    Fee Duration  Discount
340    Java  22000   60days      1200
123     NaN  25000   40days      2500
115  pandas  30000   35days      1500
101   Spark  20000   30days      1000
100   Spark  26000   50days      3000
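The ascending parameter also accepts a list of booleans, which matters when the index is multilevel. The sketch below uses a hypothetical two-level index with the made-up level names course and batch; it sorts the first level in ascending order and the second in descending order.

```python
import pandas as pd

# Hypothetical two-level index: (course, batch)
idx = pd.MultiIndex.from_tuples(
    [("Spark", 1), ("Java", 2), ("Spark", 2), ("Java", 1)],
    names=["course", "batch"],
)
df_multi = pd.DataFrame({"Fee": [20000, 22000, 26000, 25000]}, index=idx)

# One boolean per level: course ascending, batch descending
mixed = df_multi.sort_index(level=["course", "batch"], ascending=[True, False])
print(mixed.index.tolist())
# [('Java', 2), ('Java', 1), ('Spark', 2), ('Spark', 1)]
```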
4. Reset Index on Sorted Result
Sometimes after sorting you may need to reset the index so that it starts from zero. To do this, use ignore_index=True, which discards the existing index labels and creates a new index.
# Sort ignoring index
df2 = df.sort_index(ignore_index=True)
print(df2)
This yields the output below.
# Output:
  Courses    Fee Duration  Discount
0   Spark  26000   50days      3000
1   Spark  20000   30days      1000
2  pandas  30000   35days      1500
3     NaN  25000   40days      2500
4    Java  22000   60days      1200
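As a quick check of what ignore_index=True does, the sketch below compares it with sorting first and then calling reset_index(drop=True). The DataFrame df_small is a hypothetical reduced version of the example data.

```python
import pandas as pd

# Hypothetical reduced version of the example DataFrame
df_small = pd.DataFrame({"Fee": [20000, 25000, 30000]}, index=[101, 123, 115])

# One step: sort and discard the old labels
one_step = df_small.sort_index(ignore_index=True)

# Two steps: sort, then drop the old labels explicitly
two_step = df_small.sort_index().reset_index(drop=True)

print(one_step.equals(two_step))  # True
print(one_step["Fee"].tolist())   # [20000, 30000, 25000]
```

In both cases the original labels are lost, so use this only when the index carries no information you need later.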
5. Sort by Column Names/Labels
By default, sorting happens on index labels. Use axis=1 to change this and sort the columns by name in a pandas DataFrame.
# Sort by column names
df2 = df.sort_index(axis=1)
print(df2)
This yields the output below.
# Output:
    Courses  Discount Duration    Fee
101   Spark      1000   30days  20000
123     NaN      2500   40days  25000
115  pandas      1500   35days  30000
340    Java      1200   60days  22000
100   Spark      3000   50days  26000
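The axis and ascending parameters can be combined. The following sketch, using a hypothetical two-row DataFrame df_cols with the same column names, sorts the columns in reverse alphabetical order while leaving the row order untouched.

```python
import pandas as pd

# Hypothetical two-row DataFrame with the same column names as the example
df_cols = pd.DataFrame(
    {
        "Courses": ["Spark", "Java"],
        "Fee": [20000, 22000],
        "Duration": ["30days", "60days"],
        "Discount": [1000, 1200],
    },
    index=[101, 340],
)

# Sort column labels in descending order; rows keep their original order
desc_cols = df_cols.sort_index(axis=1, ascending=False)
print(desc_cols.columns.tolist())  # ['Fee', 'Duration', 'Discount', 'Courses']
```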
Conclusion
In this article, you have learned the syntax of the sort_index() method and how to sort rows by index, sort in descending order, reset the index on the sorted result, and sort a DataFrame by column names.
Happy Learning !!