Pandas – Extract Month and Year Separately From Datetime Column

You can extract the month and year separately from a pandas Datetime column in several ways. In this article, I will explain how to extract the year and the month from a Datetime column using the pandas.Series.dt.year and pandas.Series.dt.month attributes respectively. If the data is not of Datetime type, you need to convert it first by using the pd.to_datetime() method. I will also cover extracting the year and month using pandas.DatetimeIndex.year and pandas.DatetimeIndex.month, along with the strftime() method.
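As a minimal sketch of the conversion step mentioned above, the snippet below starts from dates stored as plain strings and converts them with pd.to_datetime() so that the .dt accessor can be used. The DataFrame name raw and its values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: dates stored as plain strings, not datetimes
raw = pd.DataFrame({"InsertedDate": ["2018-08-14", "2019-10-17", "2020-11-14"]})

# Convert the string column to a datetime column so .dt becomes available
raw["InsertedDate"] = pd.to_datetime(raw["InsertedDate"])

print(raw["InsertedDate"].dt.year.tolist())   # [2018, 2019, 2020]
print(raw["InsertedDate"].dt.month.tolist())  # [8, 10, 11]
```

Calling .dt on a column that still holds strings raises an AttributeError, which is why the conversion comes first.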

1. Quick Examples of Extracting Month and Year Separately From Datetime Column

If you are in a hurry, below are some quick examples of how to extract month and year separately from the pandas DataFrame DateTime column.


# Use Series.dt.strftime() method to extract month and year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')
df['Month'] = df['InsertedDate'].dt.strftime('%m')

# Using pandas.Series.dt.year & pandas.Series.dt.month attributes
df['Year'] = df['InsertedDate'].dt.year
df['Month'] = df['InsertedDate'].dt.month

# Using pandas.DatetimeIndex() to extract month and year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year
df['month'] = pd.DatetimeIndex(df['InsertedDate']).month

# Use Series.dt.to_period() method to extract month and year
df['Month_Year'] = df['InsertedDate'].dt.to_period('M')

# Use Series.apply() with lambda function and strftime()
df['Month_Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%B-%Y'))

# Use pd.to_datetime() and Series.dt.strftime() methods
df['yyyy-mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y-%m')

2. Use Series.dt.strftime() Method to Extract Month and Year

The strftime() method takes a format string and returns each date as a string in that format. Use %Y and %m as format codes to extract the year and month respectively. Note that the results are strings, not numbers.


import pandas as pd

# Sample data: insertion dates indexed by course name
Dates = ["2018-08-14","2019-10-17","2020-11-14","2020-05-17","2021-09-15","2021-12-14"]
Courses =["Spark","PySpark","Hadoop","Python","Pandas","Hadoop"]
df = pd.DataFrame({'InsertedDate': pd.to_datetime(Dates)},index=Courses)

# Use Series.dt.strftime() method to extract month and year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')
df['Month'] = df['InsertedDate'].dt.strftime('%m')
print(df)

Yields below output.


        InsertedDate  Year Month
Spark     2018-08-14  2018    08
PySpark   2019-10-17  2019    10
Hadoop    2020-11-14  2020    11
Python    2020-05-17  2020    05
Pandas    2021-09-15  2021    09
Hadoop    2021-12-14  2021    12
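The zero-padded values such as 08 in the output above are strings. If you need numbers for arithmetic or comparisons, one option is to cast the result with astype(int), as in this small sketch (the Series name dates is only for illustration).

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2018-08-14", "2020-05-17"]))

month_str = dates.dt.strftime("%m")   # zero-padded strings
month_int = month_str.astype(int)     # plain integers

print(month_str.tolist())  # ['08', '05']
print(month_int.tolist())  # [8, 5]
```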

3. Extract Month and Year Using pandas.Series.dt.year & pandas.Series.dt.month Attributes

You can use the pandas.Series.dt.year and pandas.Series.dt.month attributes to get the year and month as integers. Each returns a Series, so assign it to a column to get a DataFrame with year and month columns.


# Using pandas.Series.dt.year & pandas.Series.dt.month attributes
df['Year'] = df['InsertedDate'].dt.year
df['Month'] = df['InsertedDate'].dt.month
print(df)

Yields below output.


        InsertedDate  Year  Month
Spark     2018-08-14  2018      8
PySpark   2019-10-17  2019     10
Hadoop    2020-11-14  2020     11
Python    2020-05-17  2020      5
Pandas    2021-09-15  2021      9
Hadoop    2021-12-14  2021     12
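Because dt.year and dt.month return integers, they can be compared directly to filter rows. The sketch below uses a shortened copy of the sample data; the names demo, in_2020 and second_half are hypothetical.

```python
import pandas as pd

dates = ["2018-08-14", "2019-10-17", "2020-11-14", "2020-05-17"]
courses = ["Spark", "PySpark", "Hadoop", "Python"]
demo = pd.DataFrame({"InsertedDate": pd.to_datetime(dates)}, index=courses)

# Rows inserted in 2020
in_2020 = demo[demo["InsertedDate"].dt.year == 2020]

# Rows inserted from July onward, in any year
second_half = demo[demo["InsertedDate"].dt.month >= 7]

print(in_2020.index.tolist())      # ['Hadoop', 'Python']
print(second_half.index.tolist())  # ['Spark', 'PySpark', 'Hadoop']
```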

4. Use pandas.DatetimeIndex() to Extract Month and Year

To extract the month and year this way, pass the Datetime column to pd.DatetimeIndex() and read its month and year attributes. Note that the pd.DatetimeIndex() constructor takes array-like datetime data, such as a Series or a list of dates, as its argument.


# Using pandas.DatetimeIndex() to extract month and year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year
df['month'] = pd.DatetimeIndex(df['InsertedDate']).month
print(df)

Yields the same year and month values as above, this time in the new lowercase year and month columns.
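The same attributes are available when the dates are already the index of the DataFrame, in which case no extra conversion is needed. This is a small sketch with made-up data; the name by_date is only for illustration.

```python
import pandas as pd

# pd.to_datetime() on a list returns a DatetimeIndex
idx = pd.to_datetime(["2018-08-14", "2019-10-17", "2020-11-14"])
by_date = pd.DataFrame({"Courses": ["Spark", "PySpark", "Hadoop"]}, index=idx)

# Read year and month straight from the DatetimeIndex
by_date["year"] = by_date.index.year
by_date["month"] = by_date.index.month
print(by_date)
```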

5. Use Series.dt.to_period() Method to Extract Month and Year

You can also use the Series.dt.to_period('M') method to get the month and year together as a monthly period. The column has to be of datetime type.


# Use Series.dt.to_period() method to extract month and year
df['Month_Year'] = df['InsertedDate'].dt.to_period('M')
print(df)

Yields below output.


        InsertedDate   Month_Year
Spark     2018-08-14    2018-08
PySpark   2019-10-17    2019-10
Hadoop    2020-11-14    2020-11
Python    2020-05-17    2020-05
Pandas    2021-09-15    2021-09
Hadoop    2021-12-14    2021-12
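The resulting column holds Period values rather than strings, so the year and month can still be read from it separately through the .dt accessor. The following sketch illustrates this on a two-row Series made up for the example.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2018-08-14", "2020-05-17"]))
periods = dates.dt.to_period("M")

# Year and month are still available as integers on the period column
print(periods.dt.year.tolist())      # [2018, 2020]
print(periods.dt.month.tolist())     # [8, 5]

# Convert to plain strings if a text label is needed
print(periods.astype(str).tolist())  # ['2018-08', '2020-05']
```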

6. Use Series.apply() With Lambda Function and strftime()

Let’s see how to get the month name and year by using Series.apply() with a lambda function that calls strftime() on each value.


# Use Series.apply() with lambda function and strftime()
df['Month_Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%B-%Y'))
print(df)

Yields below output.


        InsertedDate      Month_Year
Spark     2018-08-14     August-2018
PySpark   2019-10-17    October-2019
Hadoop    2020-11-14   November-2020
Python    2020-05-17        May-2020
Pandas    2021-09-15  September-2021
Hadoop    2021-12-14   December-2021
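For comparison, the same strings can be produced without apply() by calling dt.strftime() on the whole column, and dt.month_name() returns just the month names. This is a sketch on a small made-up Series; note that %B follows the system locale, while month_name() defaults to English.

```python
import pandas as pd

dates = pd.Series(pd.to_datetime(["2018-08-14", "2020-05-17"]))

# Row-by-row formatting with apply() and a lambda
via_apply = dates.apply(lambda x: x.strftime("%B-%Y"))

# Vectorized formatting on the whole column
via_dt = dates.dt.strftime("%B-%Y")

print(via_apply.tolist() == via_dt.tolist())  # True
print(dates.dt.month_name().tolist())         # ['August', 'May']
```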

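7. Use pandas.to_datetime() and Series.dt.strftime() Methods

To add a column with ‘year-month’ pairs, convert the column with pd.to_datetime() and format it with dt.strftime('%Y-%m'). The pd.to_datetime() call is redundant here because InsertedDate is already of datetime type, but it is needed when the column holds strings.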


# Use pd.to_datetime() and Series.dt.strftime() methods
df['yyyy-mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y-%m')
print(df)

Yields below output.


        InsertedDate  yyyy-mm
Spark     2018-08-14  2018-08
PySpark   2019-10-17  2019-10
Hadoop    2020-11-14  2020-11
Python    2020-05-17  2020-05
Pandas    2021-09-15  2021-09
Hadoop    2021-12-14  2021-12
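A typical use for such a year-month column is grouping. The sketch below counts rows per month; the dates and the names demo and counts are hypothetical and chosen so that one month appears twice.

```python
import pandas as pd

# Hypothetical dates, with two entries falling in November 2020
dates = pd.to_datetime(["2020-11-14", "2020-11-20", "2020-05-17", "2021-09-15"])
demo = pd.DataFrame({"InsertedDate": dates})

demo["yyyy-mm"] = demo["InsertedDate"].dt.strftime("%Y-%m")

# Count how many rows fall into each year-month
counts = demo.groupby("yyyy-mm").size()
print(counts.to_dict())  # {'2020-05': 1, '2020-11': 2, '2021-09': 1}
```

Because the format puts the year first and zero-pads the month, sorting these strings alphabetically also sorts them chronologically.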

8. Complete Example of Extracting Month and Year Separately From Datetime Column


import pandas as pd

# Sample data: insertion dates indexed by course name
Dates = ["2018-08-14","2019-10-17","2020-11-14","2020-05-17","2021-09-15","2021-12-14"]
Courses =["Spark","PySpark","Hadoop","Python","Pandas","Hadoop"]
df = pd.DataFrame({'InsertedDate': pd.to_datetime(Dates)},index=Courses)

# Use Series.dt.strftime() method to extract month and year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')
df['Month'] = df['InsertedDate'].dt.strftime('%m')
print(df)

# Using pandas.Series.dt.year & pandas.Series.dt.month attributes
df['Year'] = df['InsertedDate'].dt.year
df['Month'] = df['InsertedDate'].dt.month
print(df)

# Using pandas.DatetimeIndex() to extract month and year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year
df['month'] = pd.DatetimeIndex(df['InsertedDate']).month
print(df)

# Use Series.dt.to_period() method to extract month and year
df['Month_Year'] = df['InsertedDate'].dt.to_period('M')
print(df)

# Use Series.apply() with lambda function and strftime()
df['Month_Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%B-%Y'))
print(df)

# Use pd.to_datetime() and Series.dt.strftime() methods
df['yyyy-mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y-%m')
print(df)

# Use Series.dt.strftime() method for separate year and month columns
df['yyyy'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y')
df['mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%m')
print(df)

Conclusion

In this article, you have learned how to extract the month and year separately from a pandas Datetime column by using the pandas.Series.dt.year and pandas.Series.dt.month attributes, and the pandas.Series.dt.strftime(), pandas.DatetimeIndex(), Series.dt.to_period() and Series.apply() methods, with examples.

Happy Learning !!
