We can sort Pandas DataFrame by date using sort_values()
function. Using this function we can sort single and multiple DateTime columns in ascending, descending, and even a variable order of a given DataFrame. In this article, I will explain the sort_values()
function and using this how we can sort the Datetime column by DateTime with examples.
Key Points –
- Use the
sort_values()
method in Pandas to sort a DataFrame by a DateTime column. - Specify the DateTime column to sort by using the
by
parameter. - Use the
ascending
parameter to control the sorting order (default is ascending). - Utilize the
inplace
parameter to apply sorting directly to the DataFrame. - Optionally, reset the index after sorting using the
reset_index()
method.
1. Quick Examples of Sort Pandas DataFrame by Date
If you are in a hurry, below are some quick examples of how to sort DataFrame by DateTime
# Quick examples of sort pandas dataframe by date
# Example 1: Sort DataFrame by date column
df.sort_values(by='Starting_dates', inplace = True)
# Example 2: Sort DataFrame by date column in descending order
df.sort_values(by='Starting_dates', ascending = False, inplace = True)
# Example 3: Sort multiple date columns
df.sort_values(by=['Starting_dates', 'ending_dates'], inplace = True)
# Example 4: Sort multiple date columns in descending order
df.sort_values(by=['Starting_dates', 'ending_dates'], ascending = False, inplace = True)
2. Sort Pandas DataFrame by Date
Following are the steps to sort a Pandas DataFrame by date or datetime column. Let’s follow the steps
The first step create a DataFrame with a few rows and columns, execute the above examples, and validate the results. Our DataFrame contains column names Courses
, Fee
, Duration
, Discount
and 'Starting_dates'
.
import pandas as pd
technologies = {
'Courses':["Spark", "Hadoop", "pandas", "Python", "PySpark"],
'Fee' :[20000,25000,30000, 22000, 24000],
'Duration':['30days', '40days', '35days', '40days', '50days'],
'Discount':[1000, 2500, 1500, 1200, 2300],
'Starting_dates': ["10/10/2022", "25/10/2022", "30/10/2022", "20/10/2022", "15/10/2022"]
}
df = pd.DataFrame(technologies)
print(df)
print(df.dtype)
Yields below output.
2.1 Convert String Data Type into Datetime Object
In order to sort Pandas DataFrame by its DateTime column, we need to make sure the date columns are in Datetime format. To use our DataFrame first we need to convert the string to datetime object.
Let’s convert the specified column of DataFrame an object
type to a datetime
type using to_datetime() function.
# Convert object type to datetime object
df['Starting_dates'] = pd.to_datetime(df['Starting_dates'])
print(df.dtypes)
Yields below output.
As we can see we have successfully converted our data type to datetime
.
Alternatively, we can convert the given column object type to string type by using pd.astype()
function.
# Convert object type to datetime object
df["Starting_dates"] = df["Starting_dates"].astype('datetime64[ns]')
print(df.dtypes)
The output is the same as above.
2.2 Sort Pandas DataFrame by Date
Using the Pandas sort_values()
function we can sort a given DataFrame by its DateTime column. For that, we need to set by
param as a datetime
column along with inplace = True
, by default it will sort the given DataFrame in ascending order by specified column and return the existing DataFrame with sorted values. inplace = True
is used to sort the DataFrame in place on the existing df
.
# Sort DataFrame by date column
df.sort_values(by='Starting_dates', inplace = True)
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount Starting_dates
0 Spark 20000 30days 1000 10/10/2022
4 PySpark 24000 50days 2300 15/10/2022
3 Python 22000 40days 1200 20/10/2022
1 Hadoop 25000 40days 2500 25/10/2022
2 pandas 30000 35days 1500 30/10/2022
3. Sort Pandas by Date in Descending Order
Using the sort_values()
function we can sort the given DataFrame over the Datetime column in descending order. It can be done by setting ascending
param as False
and pass into the sort_values() function. It sorts the DataFrame in descending order over the datetime column.
# Sort DataFrame by date column in descending order
df.sort_values(by='Starting_dates', ascending = False, inplace = True)
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount Starting_dates
2 pandas 30000 35days 1500 30/10/2022
1 Hadoop 25000 40days 2500 25/10/2022
3 Python 22000 40days 1200 20/10/2022
4 PySpark 24000 50days 2300 15/10/2022
0 Spark 20000 30days 1000 10/10/2022
4. Sort Pandas by Multiple Date Columns
Similarly, pass multiple columns to by param to sort pandas DataFrame by multiple columns by DateTime. Let’s add another date column to the given DataFrame
then, convert object type to DateTime object using <a href="https://sparkbyexamples.com/pandas/pandas-to_datetime-examples/">to_datetime()</a>
function.
# Add new date column
ending_dates = ["10/11/2022", "5/12/2022", "5/12/2022", "1/12/2022", "5/12/2022"]
# Add column to DataFrame
df['ending_dates'] = ending_dates
# Apply to_datetime() function to multiple columns
df[['Starting_dates', 'ending_dates']] = df[['Starting_dates', 'ending_dates']].apply(pd.to_datetime)
print(df.dtypes)
print(df)
Apply sort_values()
function on multiple date columns of a given DataFrame, by default it will sort both date columns in ascending order and return the existing DataFrame with sorted values over Date columns.
# Sort multiple date columns
df.sort_values(by=['Starting_dates', 'ending_dates'], inplace = True)
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount Starting_dates ending_dates
0 Spark 20000 30days 1000 2022-10-10 2022-10-11
4 PySpark 24000 50days 2300 2022-10-15 2022-05-12
3 Python 22000 40days 1200 2022-10-20 2022-01-12
1 Hadoop 25000 40days 2500 2022-10-25 2022-05-12
2 pandas 30000 35days 1500 2022-10-30 2022-05-12
5. Sort Pandas by Multiple Date Columns in Descending Order
Set ascending param as False and pass it into this function along with specified multiple date columns. It sorts the given DataFrame in descending order by its date columns.
# Sort multiple date columns in descending order
df.sort_values(by=['Starting_dates', 'ending_dates'], ascending = False, inplace = True)
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount Starting_dates ending_dates
2 pandas 30000 35days 1500 2022-10-30 2022-05-12
1 Hadoop 25000 40days 2500 2022-10-25 2022-05-12
3 Python 22000 40days 1200 2022-10-20 2022-01-12
4 PySpark 24000 50days 2300 2022-10-15 2022-05-12
0 Spark 20000 30days 1000 2022-10-10 2022-10-11
6. Conclusion
In this article, I have explained how to sort Pandas DataFrame by Date using the sort_values()
function. Also, learned sorting by single and multiple date columns, in ascending, descending, and even variable order with multiple examples. If your dates are not in date format use to_datatime()
to convert to DateTime.
Happy learning!!
Related Articles
- How to sort Pandas Series values
- How to sort groups of Pandas DataFrame?
- How to Sort Multiple Columns in pandas DataFrame
- Select Pandas DataFrame Rows Between Two Dates
- How to Format Pandas Datetime?
- Pandas DatetimeIndex Usage Explained
- Convert Pandas DatetimeIndex to String
- pandas Convert Datetime to Seconds
- Pandas Extract Year from Datetime
- Pandas Convert Date (datetime) to String Format
- pandas.DataFrame.sort_values() – Examples
- Pandas Sort by Column Values DataFrame
- pandas DataFrame.sort_index() – Sort by Index
- Pandas Groupby Sort within Groups
- How to Sort Multiple Columns in Pandas DataFrame
- Pandas Series.sort_values() With Examples