  • Post last modified:May 27, 2024
pandas DataFrame.sort_index() – Sort by Index

In Pandas, the DataFrame.sort_index() method is used to sort a DataFrame by its index. This method rearranges the rows of the DataFrame based on the index values. By default, it sorts the DataFrame in ascending order of index values, but you can specify the ascending parameter to change the sorting order.

In this article, I will explain the DataFrame.sort_index() function, its syntax and parameters, and how to use it to sort a pandas DataFrame by index labels or by column names. The function accepts the parameters axis, level, ascending, inplace, kind, na_position, sort_remaining, ignore_index, and key. By default it returns a new, sorted DataFrame. With inplace=True it modifies the existing DataFrame instead and returns None.

Key Points –

  • Sorting is applied to the axis labels, not to the data values.
  • Use axis=1 to sort by column names and axis=0 to sort by index.
  • Supports the sorting algorithms ‘quicksort’, ‘mergesort’, ‘heapsort’, and ‘stable’.

Pandas DataFrame.sort_index() Introduction

The following is the syntax of pandas.DataFrame.sort_index():


# Syntax of sort_index() function
DataFrame.sort_index(axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index=False, key=None)
  • axis – {0 or ‘index’, 1 or ‘columns’}, default 0. The axis along which to sort.
  • level – int or level name or list of ints or list of level names, optional. If the axis is a MultiIndex (hierarchical), sort by this level.
  • ascending – bool or list of bool, default True. Sort ascending vs. descending. When the index is a MultiIndex, specify a list of bools to control the sort direction of each level individually.
  • inplace – bool, default False. If True, perform operation in-place.
  • kind – {‘quicksort’, ‘mergesort’, ‘heapsort’, ‘stable’}, default ‘quicksort’. Choice of sorting algorithm. ‘mergesort’ and ‘stable’ are the only stable algorithms.
  • na_position – {‘first’, ‘last’}, default ‘last’. Determine placement of NA/null values.
  • sort_remaining – bool, default True. If True and sorting by level on a MultiIndex, also sort by the other levels (in order) after sorting by the specified level.
  • ignore_index – bool, default False. If True, the resulting axis will be labeled 0, 1, …, n – 1.
  • key – callable or None, optional. Apply the key function to the index values before sorting. The function receives an Index and should return an Index of the same shape.
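
The level and list-valued ascending parameters only matter for a MultiIndex, which the examples in this article do not use. The following is a minimal sketch of how they might be used. The sales DataFrame and its region/year index are hypothetical sample data, not part of the article's dataset.

```python
import pandas as pd

# Hypothetical sample data: a two-level index (region, year), deliberately unsorted.
idx = pd.MultiIndex.from_tuples(
    [("west", 2022), ("east", 2023), ("west", 2021), ("east", 2021)],
    names=["region", "year"],
)
sales = pd.DataFrame({"amount": [40, 10, 30, 20]}, index=idx)

# Sort by the "year" level first; because sort_remaining=True by default,
# rows with the same year are then ordered by the remaining level ("region").
by_year = sales.sort_index(level="year")
print(by_year)

# Give one bool per level: region ascending, year descending.
mixed = sales.sort_index(ascending=[True, False])
print(mixed)
```

In this sketch, by_year starts with the two 2021 rows (east, then west), while mixed lists the east rows first with 2023 ahead of 2021.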

Let’s understand these parameters by running some examples. First, let’s create a pandas DataFrame from a dictionary.


# Create a pandas DataFrame.
import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", np.nan, "pandas", "Java", "Spark"],
    'Fee': [20000, 25000, 30000, 22000, 26000],
    'Duration': ['30days', '40days', '35days', '60days', '50days'],
    'Discount': [1000, 2500, 1500, 1200, 3000]
}
df = pd.DataFrame(technologies, index=[101, 123, 115, 340, 100])
print(df)

Pandas Sort by Index

By default, the pandas sort_index() function sorts DataFrame rows by index in ascending order and returns a new DataFrame. Use inplace=True to sort the existing DataFrame in place, in which case the method returns None.


# Default sort by index labels
df2 = df.sort_index()
print(df2)

Yields below output.


# Output:
    Courses    Fee Duration  Discount
100   Spark  26000   50days      3000
101   Spark  20000   30days      1000
115  pandas  30000   35days      1500
123     NaN  25000   40days      2500
340    Java  22000   60days      1200
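
The difference between the default behavior and inplace=True can be checked with a small sketch like the one below. It uses a cut-down, hypothetical DataFrame (fees only) so that it runs on its own.

```python
import pandas as pd

# Small stand-in DataFrame (hypothetical), with an unsorted integer index.
fees = pd.DataFrame({"Fee": [20000, 25000, 30000]}, index=[101, 123, 115])

# Default: a new DataFrame is returned and the original keeps its order.
sorted_copy = fees.sort_index()
print(list(fees.index))         # [101, 123, 115]
print(list(sorted_copy.index))  # [101, 115, 123]

# inplace=True: the original is reordered and the call returns None.
result = fees.sort_index(inplace=True)
print(result)                   # None
print(list(fees.index))         # [101, 115, 123]
```

Because the in-place call returns None, assigning its result back to the same variable (fees = fees.sort_index(inplace=True)) would discard the DataFrame.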

Sort by Descending Order

By default, sorting is in ascending order. To sort in descending order, use the ascending=False param.


# Sort by Descending order
df2 = df.sort_index(ascending=False)
print(df2)

Yields below output.


# Output:
    Courses    Fee Duration  Discount
340    Java  22000   60days      1200
123     NaN  25000   40days      2500
115  pandas  30000   35days      1500
101   Spark  20000   30days      1000
100   Spark  26000   50days      3000
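
The sort direction is independent of where missing index labels end up, which is controlled by na_position. The sketch below illustrates this with a hypothetical scores DataFrame whose index contains a NaN label. The article's own DataFrame has no missing index labels.

```python
import numpy as np
import pandas as pd

# Hypothetical frame whose index contains a missing label.
scores = pd.DataFrame({"score": [10, 20, 30]}, index=[2.0, np.nan, 1.0])

# Descending sort: the NaN label still goes last (na_position='last' is the default).
desc_default = scores.sort_index(ascending=False)
print(desc_default)

# Descending sort with the NaN label moved to the top.
desc_na_first = scores.sort_index(ascending=False, na_position="first")
print(desc_na_first)
```

Here desc_default orders the labels as 2.0, 1.0, NaN, and desc_na_first orders them as NaN, 2.0, 1.0.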

Reset Index on Sorted Result

Sometimes after sorting you may need to reset the index so that it starts from zero. To do this, use ignore_index=True. The rows are still sorted by the original index, but the original labels are then discarded and the result is labeled 0, 1, …, n – 1.


# Sort ignoring index
df2 = df.sort_index(ignore_index=True)
print(df2)

Yields below output.


# Output:
  Courses    Fee Duration  Discount
0   Spark  26000   50days      3000
1   Spark  20000   30days      1000
2  pandas  30000   35days      1500
3     NaN  25000   40days      2500
4    Java  22000   60days      1200
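
As a rough cross-check, ignore_index=True should give the same result as sorting first and then calling reset_index(drop=True). The sketch below compares the two on a small hypothetical DataFrame.

```python
import pandas as pd

# Small stand-in DataFrame (hypothetical), with an unsorted integer index.
fees = pd.DataFrame({"Fee": [20000, 25000, 30000]}, index=[101, 123, 115])

# One step: sort by index and relabel the rows 0..n-1.
one_step = fees.sort_index(ignore_index=True)

# Two steps: sort by index, then drop the old labels.
two_step = fees.sort_index().reset_index(drop=True)

print(one_step)
print(one_step.equals(two_step))  # True
```

The one-step form is shorter. The two-step form is useful when you want to keep the old labels as a column, which reset_index() does by default when drop=True is omitted.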

Sort by Column Names/Labels

By default, sorting happens on index labels. Use axis=1 to sort the columns of a pandas DataFrame by name instead.


# Sort by column names
df2 = df.sort_index(axis=1)
print(df2)

Yields below output.


# Output:
    Courses  Discount Duration    Fee
101   Spark      1000   30days  20000
123     NaN      2500   40days  25000
115  pandas      1500   35days  30000
340    Java      1200   60days  22000
100   Spark      3000   50days  26000
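
Column labels are compared as plain strings, so uppercase names sort before lowercase ones. The key parameter can be used to change that. The following is a minimal sketch with a hypothetical one-row DataFrame (courses) whose column names mix upper and lower case.

```python
import pandas as pd

# Hypothetical frame with mixed-case column names.
courses = pd.DataFrame(
    {"fee": [20000], "Courses": ["Spark"], "discount": [1000], "Duration": ["30days"]}
)

# Plain sort: uppercase labels come before lowercase ones.
default_cols = courses.sort_index(axis=1)
print(list(default_cols.columns))   # ['Courses', 'Duration', 'discount', 'fee']

# Case-insensitive sort: the key receives the column Index and returns
# a lowercased Index of the same shape, which is used only for ordering.
ci_cols = courses.sort_index(axis=1, key=lambda labels: labels.str.lower())
print(list(ci_cols.columns))        # ['Courses', 'discount', 'Duration', 'fee']

# Reverse alphabetical column order.
desc_cols = courses.sort_index(axis=1, ascending=False)
print(list(desc_cols.columns))      # ['fee', 'discount', 'Duration', 'Courses']
```

The key function only affects the ordering. The column labels in the result keep their original spelling.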

Conclusion

In this article, you have learned the syntax of the sort_index() method, how to sort rows by index, and how to sort a DataFrame's columns by name.

Happy Learning !!