• Post author:
  • Post category:Pandas
  • Post last modified:March 27, 2024
  • Reading time:12 mins read
You are currently viewing Sort Pandas DataFrame by Date (Datetime)

We can sort Pandas DataFrame by date using sort_values() function. Using this function we can sort single and multiple DateTime columns in ascending, descending, and even a variable order of a given DataFrame. In this article, I will explain the sort_values() function and using this how we can sort the Datetime column by DateTime with examples.

1. Quick Examples of Sort Pandas DataFrame by Date

If you are in a hurry, below are some quick examples of how to sort DataFrame by DateTime


# Below are some quick examples .

# Example 1: Sort DataFrame by date column
df.sort_values(by='Starting_dates', inplace = True) 

# Example 2: Sort DataFrame by date column in descending order
df.sort_values(by='Starting_dates', ascending = False, inplace = True) 

# Example 3: Sort multiple date columns
df.sort_values(by=['Starting_dates', 'ending_dates'], inplace = True)

# Example 4: Sort multiple date columns in descending order
df.sort_values(by=['Starting_dates', 'ending_dates'], ascending = False, inplace = True)

2. Sort Pandas DataFrame by Date

Following are the steps to sort a Pandas DataFrame by date or datetime column. Let’s follow the steps

The first step create a DataFrame with a few rows and columns, execute the above examples, and validate the results. Our DataFrame contains column names CoursesFeeDurationDiscount and 'Starting_dates'.


import pandas as pd
technologies = {
      'Courses':["Spark", "Hadoop", "pandas", "Python", "PySpark"],
     'Fee' :[20000,25000,30000, 22000, 24000],
     'Duration':['30days', '40days', '35days', '40days', '50days'],
     'Discount':[1000, 2500, 1500, 1200, 2300],
     'Starting_dates': ["10/10/2022",  "25/10/2022", "30/10/2022", "20/10/2022", "15/10/2022"]
     }
df = pd.DataFrame(technologies)
print(df)
print(df.dtype)

Yields below output.

pandas sort by date
Pandas DataFrame
pandas sort by datetime
Pandas DataFrame and its dtypes

2.1 Convert String Data Type into Datetime Object

In order to sort Pandas DataFrame by its DateTime column, we need to make sure the date columns are in Datetime format. To use our DataFrame first we need to convert the string to datetime object.

Let’s convert the specified column of DataFrame an object type to a datetime type using to_datetime() function.


# Convert object type to datetime object
df['Starting_dates'] = pd.to_datetime(df['Starting_dates'])
print(df.dtypes)

Yields below output.

dataframe sort by date
Convert string type to datetime

As we can see we have successfully converted our data type to datetime.

Alternatively, we can convert the given column object type to string type by using pd.astype() function.


# Convert object type to datetime object
df["Starting_dates"] = df["Starting_dates"].astype('datetime64[ns]')
print(df.dtypes)

The output is the same as above.

2.2 Sort Pandas DataFrame by Date

Using the Pandas sort_values() function we can sort a given DataFrame by its DateTime column. For that, we need to set by param as a datetime column along with inplace = True, by default it will sort the given DataFrame in ascending order by specified column and return the existing DataFrame with sorted values. inplace = True is used to sort the dataframe in place on the existing df.


# Sort DataFrame by date column
df.sort_values(by='Starting_dates', inplace = True)
print(df)

Yields below output.


# Output:
   Courses    Fee Duration  Discount Starting_dates
0    Spark  20000   30days      1000     10/10/2022
4  PySpark  24000   50days      2300     15/10/2022
3   Python  22000   40days      1200     20/10/2022
1   Hadoop  25000   40days      2500     25/10/2022
2   pandas  30000   35days      1500     30/10/2022

3. Sort Pandas by Date in Descending Order

Using the sort_values() function we can sort the given DataFrame over the Datetime column in descending order. It can be done by setting ascending param as False and pass into the sort_values() function. It sorts the DataFrame in descending order over the datetime column.


# Sort DataFrame by date column in descending order
df.sort_values(by='Starting_dates', ascending = False, inplace = True)
print(df)

Yields below output.


# Output:
   Courses    Fee Duration  Discount Starting_dates
2   pandas  30000   35days      1500     30/10/2022
1   Hadoop  25000   40days      2500     25/10/2022
3   Python  22000   40days      1200     20/10/2022
4  PySpark  24000   50days      2300     15/10/2022
0    Spark  20000   30days      1000     10/10/2022

4. Sort Pandas by Multiple Date Columns

Similarly, pass multiple columns to by param to sort pandas DataFrame by multiple columns by datetime. Let’s add another date column to the given DataFrame then, convert object type to datetime object using <a href="https://sparkbyexamples.com/pandas/pandas-to_datetime-examples/">to_datetime()</a> function.


# Add new date column
ending_dates = ["10/11/2022", "5/12/2022", "5/12/2022", "1/12/2022", "5/12/2022"]
# Add column to DataFrame
df['ending_dates'] = ending_dates

# Apply to_datetime() function to multiple columns 
df[['Starting_dates', 'ending_dates']] = df[['Starting_dates', 'ending_dates']].apply(pd.to_datetime)
print(df.dtypes)
print(df)

Apply sort_values() function on multiple date columns of a given DataFrame, by default it will sort both date columns in ascending order and return the existing DataFrame with sorted values over Date columns.


# Sort multiple date columns
df.sort_values(by=['Starting_dates', 'ending_dates'], inplace = True)
print(df)

Yields below output.


# Output:
   Courses    Fee Duration  Discount Starting_dates ending_dates
0    Spark  20000   30days      1000     2022-10-10   2022-10-11
4  PySpark  24000   50days      2300     2022-10-15   2022-05-12
3   Python  22000   40days      1200     2022-10-20   2022-01-12
1   Hadoop  25000   40days      2500     2022-10-25   2022-05-12
2   pandas  30000   35days      1500     2022-10-30   2022-05-12

5. Sort Pandas by Multiple Date Columns in Descending Order

Set ascending param as False and pass it into this function along with specified multiple date columns. It sorts the given DataFrame in descending order by its date columns.


# Sort multiple date columns in descending order
df.sort_values(by=['Starting_dates', 'ending_dates'], ascending = False, inplace = True)
print(df)

Yields below output.


# Output:
  Courses    Fee Duration  Discount Starting_dates ending_dates
2   pandas  30000   35days      1500     2022-10-30   2022-05-12
1   Hadoop  25000   40days      2500     2022-10-25   2022-05-12
3   Python  22000   40days      1200     2022-10-20   2022-01-12
4  PySpark  24000   50days      2300     2022-10-15   2022-05-12
0    Spark  20000   30days      1000     2022-10-10   2022-10-11

6. Conclusion

In this article, I have explained how to sort Pandas DataFrame by Date using the sort_values() function. Also, learned sorting by single and multiple date columns, in ascending, descending, and even variable order with multiple examples. If your dates are not in date format use to_datatime() to convert to DateTime.

Happy learning!!

References

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ LinkedIn and Medium