
pandas DataFrame.sort_index() – Sort by Index


The pandas DataFrame.sort_index() method sorts a DataFrame by its row index labels or by its column labels. It accepts the parameters axis, level, ascending, inplace, kind, na_position, sort_remaining, ignore_index, and key, and by default returns a new DataFrame with the sorted result. Pass inplace=True to sort the existing DataFrame in place; in that case the method returns None.

sort_index() key Points

Here, I will explain the syntax of the sort_index() method and walk through its usage with examples.

1. Syntax of DataFrame.sort_index()

The syntax of pandas.DataFrame.sort_index() is as follows:


# Syntax of sort_index() function
DataFrame.sort_index(axis=0, level=None, ascending=True, inplace=False, kind='quicksort', na_position='last', sort_remaining=True, ignore_index=False, key=None)

Let’s understand these parameters by running some examples. First, let’s create a DataFrame with a few rows and columns. The DataFrame has the columns Courses, Fee, Duration, and Discount, and uses unsorted integer labels as its row index.


# Create a pandas DataFrame.
import pandas as pd
import numpy as np

technologies = {
    'Courses': ["Spark", np.nan, "pandas", "Java", "Spark"],
    'Fee': [20000, 25000, 30000, 22000, 26000],
    'Duration': ['30days', '40days', '35days', '60days', '50days'],
    'Discount': [1000, 2500, 1500, 1200, 3000]
}
# Use custom, unsorted integer labels as the row index
df = pd.DataFrame(technologies, index=[101, 123, 115, 340, 100])
print(df)

2. pandas Sort by Index

By default, sort_index() sorts the DataFrame rows by index labels in ascending order and returns a new DataFrame. Use inplace=True to sort the existing DataFrame in place; in that case the method returns None.


# Default sort by index labels
df2 = df.sort_index()
print(df2)

This yields the following output.


# Output:
    Courses    Fee Duration  Discount
100   Spark  26000   50days      3000
101   Spark  20000   30days      1000
115  pandas  30000   35days      1500
123     NaN  25000   40days      2500
340    Java  22000   60days      1200
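The difference between the default behavior and inplace=True can be checked directly. The following is a minimal sketch that rebuilds the same sample data; the names df_copy and result are only illustrative and do not appear in the article.

```python
import numpy as np
import pandas as pd

# Same sample data as in the article
df = pd.DataFrame({
    'Courses': ["Spark", np.nan, "pandas", "Java", "Spark"],
    'Fee': [20000, 25000, 30000, 22000, 26000],
    'Duration': ['30days', '40days', '35days', '60days', '50days'],
    'Discount': [1000, 2500, 1500, 1200, 3000],
}, index=[101, 123, 115, 340, 100])

# Work on a copy (illustrative name) so the original stays untouched
df_copy = df.copy()

# With inplace=True the method mutates df_copy and returns None
result = df_copy.sort_index(inplace=True)

print(result)                   # None
print(df_copy.index.tolist())   # [100, 101, 115, 123, 340]
print(df.index.tolist())        # [101, 123, 115, 340, 100]
```

Because the in-place call returns None, assigning its result back (for example df = df.sort_index(inplace=True)) would replace the DataFrame with None.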

3. Sort by Descending Order

Sorting is ascending by default. To sort in descending order, pass ascending=False.


# Sort by Descending order
df2 = df.sort_index(ascending=False)
print(df2)

This yields the following output.


# Output:
    Courses    Fee Duration  Discount
340    Java  22000   60days      1200
123     NaN  25000   40days      2500
115  pandas  30000   35days      1500
101   Spark  20000   30days      1000
100   Spark  26000   50days      3000
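Besides ascending, the signature in section 1 lists a key parameter, which is not demonstrated above. As a rough sketch, assuming pandas 1.1 or later (where key is available), key receives the index and should return a transformed index of the same length to sort by. The DataFrame df_str and its string labels are made up for illustration only.

```python
import pandas as pd

# Hypothetical DataFrame with mixed-case string labels as the index
df_str = pd.DataFrame(
    {'Fee': [20000, 22000, 30000, 25000]},
    index=['spark', 'Java', 'Pandas', 'hadoop'],
)

# Default sort compares raw strings, so uppercase labels come first
default_order = df_str.sort_index().index.tolist()
print(default_order)   # ['Java', 'Pandas', 'hadoop', 'spark']

# key lowercases the labels for comparison only; the labels themselves are unchanged
ci_order = df_str.sort_index(key=lambda idx: idx.str.lower()).index.tolist()
print(ci_order)        # ['hadoop', 'Java', 'Pandas', 'spark']
```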

4. Reset Index on Sorted Result

After sorting, you may want the index to restart from zero. To do this, pass ignore_index=True. This discards the original index labels and labels the sorted rows from 0 to n-1.


# Sort ignoring index
df2 = df.sort_index(ignore_index=True)
print(df2)

This yields the following output.


# Output:
  Courses    Fee Duration  Discount
0   Spark  26000   50days      3000
1   Spark  20000   30days      1000
2  pandas  30000   35days      1500
3     NaN  25000   40days      2500
4    Java  22000   60days      1200
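One way to read ignore_index=True is as a shortcut for sorting and then calling reset_index(drop=True). The sketch below compares the two on the article's sample data; the variable names sorted_ignore and sorted_reset are illustrative, and ignore_index is assumed to be available (pandas 1.0 or later).

```python
import numpy as np
import pandas as pd

# Same sample data as in the article
df = pd.DataFrame({
    'Courses': ["Spark", np.nan, "pandas", "Java", "Spark"],
    'Fee': [20000, 25000, 30000, 22000, 26000],
    'Duration': ['30days', '40days', '35days', '60days', '50days'],
    'Discount': [1000, 2500, 1500, 1200, 3000],
}, index=[101, 123, 115, 340, 100])

# Sort and renumber the rows in a single call
sorted_ignore = df.sort_index(ignore_index=True)

# Two-step alternative: sort, then drop the old index labels
sorted_reset = df.sort_index().reset_index(drop=True)

print(sorted_ignore.index.tolist())   # [0, 1, 2, 3, 4]
print(sorted_ignore['Fee'].tolist())  # [26000, 20000, 30000, 25000, 22000]
print(sorted_ignore.equals(sorted_reset))
```

Note that the original labels (100, 101, and so on) are lost in both variants; keep them in a column first if they are needed later.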

5. Sort by Column Names/Labels

By default, sorting happens on the row index labels. Use axis=1 to sort the DataFrame's columns by name instead.


# Sort by column names
df2 = df.sort_index(axis=1)
print(df2)

This yields the following output.


# Output:
    Courses  Discount Duration    Fee
101   Spark      1000   30days  20000
123     NaN      2500   40days  25000
115  pandas      1500   35days  30000
340    Java      1200   60days  22000
100   Spark      3000   50days  26000
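The axis and ascending parameters can be combined. As a small sketch on the same sample data, the following sorts the column names in reverse alphabetical order while leaving the row order as it is; the variable name df_desc_cols is illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the article
df = pd.DataFrame({
    'Courses': ["Spark", np.nan, "pandas", "Java", "Spark"],
    'Fee': [20000, 25000, 30000, 22000, 26000],
    'Duration': ['30days', '40days', '35days', '60days', '50days'],
    'Discount': [1000, 2500, 1500, 1200, 3000],
}, index=[101, 123, 115, 340, 100])

# axis=1 targets the column labels; ascending=False reverses their order
df_desc_cols = df.sort_index(axis=1, ascending=False)

print(df_desc_cols.columns.tolist())  # ['Fee', 'Duration', 'Discount', 'Courses']
print(df_desc_cols.index.tolist())    # [101, 123, 115, 340, 100]
```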

Conclusion

In this article, you have learned the syntax of the sort_index() method and how to sort rows by index labels, reverse the sort order, reset the index on the sorted result, and sort a DataFrame's columns by name.

Happy Learning !!
