  • Post category:Pandas
  • Post last modified:May 26, 2024
Pandas Extract Year from Datetime

In Pandas, you can extract the year from a datetime column using the dt.year attribute. If the column is not already of datetime type (for example, it holds date strings), convert it first using pd.to_datetime().

In this article, I will explain how to extract the year from the Datetime column using pandas.Series.dt.year, pandas.DatetimeIndex properties and strftime() functions.

Key Points –

  • Use the .dt.year accessor to extract the year from a DateTime column in Pandas.
  • Ensure the DateTime column is in the correct format using pd.to_datetime() if needed.
  • Pandas provides the .dt accessor for datetime series, allowing you to access various components like year, month, day, etc.
  • When dates are stored as strings in a fixed format such as YYYY-MM-DD, string slicing can extract the year portion.
  • Consider using regular expressions to extract the year from date strings with varying formats.
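The last two points apply to dates stored as text rather than as datetime values. Here is a minimal sketch, assuming hypothetical Series named raw and mixed that hold date strings. The sample values are made up for illustration.

```python
import pandas as pd

# Hypothetical date strings in a fixed YYYY-MM-DD layout.
raw = pd.Series(["2018-08-14", "2019-10-17", "2020-11-14"])

# String slicing: valid only when every value starts with a 4-digit year.
year_by_slice = raw.str[:4].astype(int)

# Hypothetical date strings in varying layouts.
mixed = pd.Series(["14/08/2018", "2019-10-17", "Nov 14, 2020"])

# Regular expression: captures the first run of four digits in each string.
year_by_regex = mixed.str.extract(r"(\d{4})", expand=False).astype(int)

print(year_by_slice.tolist())
print(year_by_regex.tolist())
```

Both approaches treat the values as plain text, so they do not check that a string is a real date. Converting with pd.to_datetime() is usually the safer choice when the format allows it.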

Quick Examples of Extract Year from Datetime

Following are quick examples of extracting the year from the pandas DataFrame DateTime column.


# Quick examples of extract year from datetime

# Example 1: Use Datetime.strftime() method to extract year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')

# Example 2: Using the pandas.Series.dt.year attribute
df['Year'] = df['InsertedDate'].dt.year

# Example 3: Using pandas.DatetimeIndex() to extract year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year

# Example 4: Use Series.dt.to_period() method to extract year
df['Year'] = df['InsertedDate'].dt.to_period('Y')

# Example 5: Use Series.apply() with lambda function and strftime()
df['Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%Y'))

# Example 6: Use Pandas.to_datetime() and datetime.strftime() method
df['yyyy'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y')

Pandas Extract Year using Datetime.strftime()

To run the examples, let's create a Pandas DataFrame with a datetime column holding year, month, and day values. The sections below use Pandas attributes and functions to extract the year from this column.


import pandas as pd

# Sample dates and the course names used as the index
Dates = ["2018-08-14","2019-10-17","2020-11-14","2020-05-17","2021-09-15","2021-12-14"]
Courses =["Spark","PySpark","Hadoop","Python","Pandas","Hadoop"]
df = pd.DataFrame({'InsertedDate': pd.to_datetime(Dates)},index=Courses)
print("Create DataFrame:\n", df)

This creates a DataFrame indexed by course name with a single datetime column, InsertedDate. The examples below extract the year from this column and add it as a new column.

The strftime() method takes a format string and returns each date rendered as a string in that format. Use %Y as the format code to extract the four-digit year. In the DataFrame above, pd.to_datetime() has already converted the date strings to datetime values.


# Use Datetime.strftime() method to extract year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')
print("Get the year from the datetime column:\n", df)

In the above example, dt.strftime('%Y') extracts the year component from the InsertedDate column of the DataFrame df. The format code %Y represents the year with century as a decimal number, and the resulting Year column holds strings such as '2018' rather than integers.
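Because strftime() returns text, the extracted year cannot be used in arithmetic until it is converted. A minimal sketch, assuming a small hypothetical Series named dates, shows the difference:

```python
import pandas as pd

# Hypothetical sample; the values mirror the first rows of the DataFrame above.
dates = pd.Series(pd.to_datetime(["2018-08-14", "2019-10-17", "2020-11-14"]))

year_text = dates.dt.strftime("%Y")   # strings such as "2018"
year_number = year_text.astype(int)   # integers such as 2018

print(year_text.tolist())
print(year_number.tolist())
```

If you need numbers from the start, dt.year avoids the extra conversion.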

Extract Year Using Series.dt.year

We can use the pandas.Series.dt.year attribute to extract the year. It is a property rather than a method, so it is accessed without parentheses, and it returns a Series of integers. Assign the result to a column to add the year to the DataFrame.


# Using the pandas.Series.dt.year attribute
df['Year'] = df['InsertedDate'].dt.year
print("Get the year from the datetime column:\n", df)

Yields below output.


# Output:
# Get the year from the datetime column:
        InsertedDate  Year
Spark     2018-08-14  2018
PySpark   2019-10-17  2019
Hadoop    2020-11-14  2020
Python    2020-05-17  2020
Pandas    2021-09-15  2021
Hadoop    2021-12-14  2021
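The .dt accessor works only on datetime columns, so date strings need converting first. Below is a minimal sketch, assuming a hypothetical DataFrame named raw in which one value is not a valid date. The errors="coerce" option and the nullable Int64 dtype are standard Pandas features, but the data is made up for illustration.

```python
import pandas as pd

# Hypothetical column of date strings, one of which cannot be parsed.
raw = pd.DataFrame({"InsertedDate": ["2018-08-14", "not a date", "2020-11-14"]})

# errors="coerce" turns unparseable values into NaT instead of raising.
parsed = pd.to_datetime(raw["InsertedDate"], errors="coerce")

# Casting to the nullable Int64 dtype keeps whole-number years
# and marks the missing one as <NA>.
raw["Year"] = parsed.dt.year.astype("Int64")
print(raw)
```

Without the cast, a column containing missing dates typically comes back as floating point, which displays years like 2018.0.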

Use Pandas DatetimeIndex() to Extract Year

We can also extract the year by passing the datetime column to pd.DatetimeIndex() and reading the year attribute of the result. The constructor takes a list, array, or Series of datetime-like values.


# Using pandas.DatetimeIndex() to extract year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year
print("Get the year from the datetime column:\n", df)

Yields the same year values as above, in a column named year.
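One difference from dt.year is the type of the result. A minimal sketch, assuming a three-row version of the sample DataFrame, compares the two:

```python
import pandas as pd

courses = ["Spark", "PySpark", "Hadoop"]
df = pd.DataFrame(
    {"InsertedDate": pd.to_datetime(["2018-08-14", "2019-10-17", "2020-11-14"])},
    index=courses,
)

years_index = pd.DatetimeIndex(df["InsertedDate"]).year  # an Index of integers
years_series = df["InsertedDate"].dt.year                # a Series labeled by course

# The Index carries no row labels, so it is assigned to the column by position.
df["year"] = years_index
print(list(years_index) == years_series.tolist())
```

Both hold the same values. The Series keeps the DataFrame's row labels, which matters if you later combine it with other labeled data.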

Use Datetime.to_period() Method to Extract Year

You can also use df['Year'] = df['InsertedDate'].dt.to_period('Y'). The column has to be of datetime type, and the result holds yearly Period values (displayed as 2018, 2019, and so on) rather than integers.


# Use Series.dt.to_period() method to extract year
df['Year'] = df['InsertedDate'].dt.to_period('Y')
print("Get the year from the datetime column:\n", df)

Yields below output.


# Output:
# Get the year from the datetime column:
        InsertedDate  Year
Spark     2018-08-14  2018
PySpark   2019-10-17  2019
Hadoop    2020-11-14  2020
Python    2020-05-17  2020
Pandas    2021-09-15  2021
Hadoop    2021-12-14  2021
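Although the printed values look like plain years, a period column behaves differently from an integer column. Here is a minimal sketch, assuming a hypothetical Series named dates, of inspecting the result and converting it back:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2018-08-14", "2019-10-17", "2020-11-14"]))

periods = dates.dt.to_period("Y")   # yearly Period values, not integers
print(periods.dtype)                # a period dtype

years = periods.dt.year             # integer year of each period
starts = periods.dt.start_time      # first instant of each year as a Timestamp

print(years.tolist())
print(starts.dt.strftime("%Y-%m-%d").tolist())
```

Periods are handy when you want to group or compare by calendar year while keeping date semantics. Use dt.year when a plain integer is enough.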

Use Series.apply() With Lambda Function and strftime()

You can call apply() on the datetime column with a lambda function that runs strftime() on each value. Because df['InsertedDate'] is a Series, this is Series.apply(); it calls the lambda once per row, so it is usually slower than the .dt accessor on large data.


# Use Series.apply() with lambda function and strftime()
df['Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%Y')) 
print("Get the year from the datetime column:\n", df)

Yields below output.


# Output:
# Get the year from the datetime column:
        InsertedDate  Year
Spark     2018-08-14  2018
PySpark   2019-10-17  2019
Hadoop    2020-11-14  2020
Python    2020-05-17  2020
Pandas    2021-09-15  2021
Hadoop    2021-12-14  2021
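The apply() approach and the .dt accessor produce the same strings. A minimal sketch, assuming a hypothetical Series named dates, checks this:

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2018-08-14", "2019-10-17", "2020-11-14"]))

via_apply = dates.apply(lambda x: x.strftime("%Y"))   # one Python call per row
via_accessor = dates.dt.strftime("%Y")                # vectorized accessor

print(via_apply.tolist() == via_accessor.tolist())
```

apply() is mainly useful when the per-row logic is more involved than a single format call.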

Use Pandas.to_datetime() and datetime.strftime() Method

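You can combine pd.to_datetime() with the Series.dt.strftime() method to extract the year. This is useful when the column holds date strings, since pd.to_datetime() converts them before strftime() formats the year. On a column that is already datetime, the conversion changes nothing.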


# Use Pandas.to_datetime() and datetime.strftime() method
df['yyyy'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y')
print("Get the year from the datetime column:\n", df)

Yields below output.


# Output:
# Get the year from the datetime column:
        InsertedDate  yyyy
Spark     2018-08-14  2018
PySpark   2019-10-17  2019
Hadoop    2020-11-14  2020
Python    2020-05-17  2020
Pandas    2021-09-15  2021
Hadoop    2021-12-14  2021
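This combination is most useful on a column of strings. Below is a minimal sketch, assuming a hypothetical DataFrame named df_text whose dates are stored as day/month/year text. The format argument is a standard pd.to_datetime() parameter.

```python
import pandas as pd

# Hypothetical string column in day/month/year order.
df_text = pd.DataFrame({"InsertedDate": ["14/08/2018", "17/10/2019", "14/11/2020"]})

# An explicit format avoids ambiguity between day-first and month-first dates.
parsed = pd.to_datetime(df_text["InsertedDate"], format="%d/%m/%Y")

df_text["yyyy"] = parsed.dt.strftime("%Y")
print(df_text)
```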

Frequently Asked Questions on Extract Year from Datetime

How can I extract the year from a datetime column in a Pandas DataFrame?

You can use the dt attribute in Pandas to extract the year from a datetime column. For example, df['year'] = df['datetime_column'].dt.year

How can I extract the year directly without creating a new column?

Yes. Accessing dt.year returns a Series that you can use directly without adding it to the DataFrame. For example, years = pd.to_datetime(df['datetime_column']).dt.year
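The year Series can feed a filter or a groupby directly, leaving the DataFrame's columns unchanged. In this minimal sketch, the Fee column and its values are hypothetical:

```python
import pandas as pd

df = pd.DataFrame({
    "datetime_column": pd.to_datetime(["2020-11-14", "2020-05-17", "2021-09-15"]),
    "Fee": [20000, 25000, 30000],   # hypothetical numeric column
})

years = df["datetime_column"].dt.year

only_2020 = df[years == 2020]                   # filter rows by year
fee_per_year = df.groupby(years)["Fee"].sum()   # aggregate by year

print(list(df.columns))
print(fee_per_year.to_dict())
```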

How can I extract the year from a datetime index in a DataFrame?

If your DataFrame has a datetime index, you can use the year attribute directly on the index. For example, after df.set_index('timestamp', inplace=True), assign df['year'] = df.index.year
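A minimal runnable sketch of this answer follows. The DataFrame and its timestamp and Courses columns are hypothetical.

```python
import pandas as pd

# Hypothetical frame with a 'timestamp' column.
df = pd.DataFrame({
    "timestamp": pd.to_datetime(["2018-08-14", "2019-10-17", "2020-11-14"]),
    "Courses": ["Spark", "PySpark", "Hadoop"],
})

df = df.set_index("timestamp")   # the index is now a DatetimeIndex
df["year"] = df.index.year       # .year works directly on the index
print(df)
```

Note that the index exposes year directly, without the .dt accessor that a Series requires.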

Conclusion

In conclusion, this article has covered several techniques for extracting the year from a Pandas datetime column: the pandas.Series.dt.year attribute, Series.dt.strftime(), pandas.DatetimeIndex, Series.dt.to_period(), and Series.apply() with a lambda function. Use dt.year when you need integers, strftime() when you need strings, and to_period() when you need yearly Period values.

Happy Learning !!
