You can extract the month and year from a datetime column in pandas in several ways. In this article, I will explain how to get the year and the month from a datetime column using the pandas.Series.dt.year and pandas.Series.dt.month attributes, respectively. If the column is not already of datetime type, you need to convert it first by using the pd.to_datetime() function. I will also cover extracting the year and month using the year and month properties of pandas.DatetimeIndex and the strftime() method.
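As a minimal sketch of the conversion step mentioned above, the snippet below starts from dates stored as plain strings. The DataFrame name raw and its sample values are hypothetical and only serve as an illustration.

```python
import pandas as pd

# Hypothetical sample data: dates stored as plain strings, not datetimes
raw = pd.DataFrame({"InsertedDate": ["2018-08-14", "2019-10-17", "2020-11-14"]})

# Convert the column to datetime before using the .dt accessor
raw["InsertedDate"] = pd.to_datetime(raw["InsertedDate"])

# Year and month are now available as integers
raw["Year"] = raw["InsertedDate"].dt.year
raw["Month"] = raw["InsertedDate"].dt.month
print(raw)
```

Without the conversion, accessing .dt on a column of strings raises an AttributeError.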
1. Quick Examples of Extract Month and Year from Datetime
If you are in a hurry, below are some quick examples of how to extract month and year separately from the pandas DataFrame DateTime column.
# Use Datetime.strftime() Method to extract month and year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')
df['Month'] = df['InsertedDate'].dt.strftime('%m')
# Using the pandas.Series.dt.year & pandas.Series.dt.month attributes
df['Year'] = df['InsertedDate'].dt.year
df['Month'] = df['InsertedDate'].dt.month
# Using pandas.DatetimeIndex() to extract month and year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year
df['month'] = pd.DatetimeIndex(df['InsertedDate']).month
# Use datetime.to_period() method to extract month and year
df['Month_Year'] = df['InsertedDate'].dt.to_period('M')
# Use Series.apply() with lambda function and strftime()
df['Month_Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%B-%Y'))
# Use Pandas.to_datetime() and datetime.strftime() method
df['yyyy-mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y-%m')
2. Pandas Extract Month and Year using Datetime.strftime()
The strftime() method takes a format string and returns each date as a string in that format. You can use %Y and %m as format codes to extract the year and month, respectively, from a pandas DataFrame column. Note that the results are strings, so the month is zero-padded (for example, 08).
import pandas as pd
Dates = ["2018-08-14","2019-10-17","2020-11-14","2020-05-17","2021-09-15","2021-12-14"]
Courses =["Spark","PySpark","Hadoop","Python","Pandas","Hadoop"]
df = pd.DataFrame({'InsertedDate': pd.to_datetime(Dates)},index=Courses)
# Use Datetime.strftime() Method to extract month and year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')
df['Month'] = df['InsertedDate'].dt.strftime('%m')
print(df)
This yields the following output.
        InsertedDate  Year Month
Spark     2018-08-14  2018    08
PySpark   2019-10-17  2019    10
Hadoop    2020-11-14  2020    11
Python    2020-05-17  2020    05
Pandas    2021-09-15  2021    09
Hadoop    2021-12-14  2021    12
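Because strftime() produces text, the values cannot be used directly in arithmetic or numeric comparisons. The following sketch, using a small hypothetical DataFrame named df_demo, shows one way to turn the strings back into integers when needed.

```python
import pandas as pd

# Hypothetical two-row frame, just for illustration
df_demo = pd.DataFrame({"InsertedDate": pd.to_datetime(["2018-08-14", "2020-05-17"])})

# strftime() returns strings; '%m' keeps the leading zero
month_str = df_demo["InsertedDate"].dt.strftime("%m")
print(month_str.tolist())  # ['08', '05']

# Cast to int when numeric values are needed
month_int = month_str.astype(int)
print(month_int.tolist())  # [8, 5]
```

If you only need numbers, the dt.year and dt.month attributes avoid the round trip through strings.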
3. Extract Month and Year Using pandas.Series.dt.year & dt.month
You can use the pandas.Series.dt.year and pandas.Series.dt.month attributes to get the year and month as integers. These are attributes, not methods, so they are accessed without parentheses. Each returns a Series object, so assign the results to columns to get a DataFrame with year and month columns.
# Using the pandas.Series.dt.year & pandas.Series.dt.month attributes
df['Year'] = df['InsertedDate'].dt.year
df['Month'] = df['InsertedDate'].dt.month
print(df)
This yields the following output.
        InsertedDate  Year  Month
Spark     2018-08-14  2018      8
PySpark   2019-10-17  2019     10
Hadoop    2020-11-14  2020     11
Python    2020-05-17  2020      5
Pandas    2021-09-15  2021      9
Hadoop    2021-12-14  2021     12
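One thing to watch for with these attributes is missing dates. The sketch below is an illustration with hypothetical input, not part of the example above. It assumes the usual pandas behavior where a missing date (NaT) produces a missing year, and uses the nullable Int64 dtype to keep the remaining years as integers.

```python
import pandas as pd

# Hypothetical input: one valid date and one value that cannot be parsed
raw_dates = pd.Series(["2018-08-14", "not a date"])

# errors="coerce" turns the unparseable value into NaT instead of raising
parsed = pd.to_datetime(raw_dates, errors="coerce")

# The year of a missing date is itself missing
years = parsed.dt.year

# The nullable integer dtype keeps 2018 as an integer next to the missing value
years_nullable = years.astype("Int64")
print(years_nullable)
```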
4. Use pandas DatetimeIndex() to Extract Month and Year
You can also extract the month and year by wrapping the datetime column in pd.DatetimeIndex(). Use the DatetimeIndex.month attribute to find the month and the DatetimeIndex.year attribute to find the year present in each date. Note that the DatetimeIndex() constructor takes an array-like of dates, such as a DataFrame column, as its argument.
# Using pandas.DatetimeIndex() to extract month and year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year
df['month'] = pd.DatetimeIndex(df['InsertedDate']).month
print(df)
The new year and month columns hold the same values as the Year and Month columns in the output above.
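When the dates are already the index of the DataFrame, the index is itself a DatetimeIndex, so the same attributes can be read from it directly. The sketch below assumes a hypothetical DataFrame named by_date that is indexed by date rather than by course name.

```python
import pandas as pd

# Hypothetical frame indexed by date instead of by course name
dates = pd.to_datetime(["2018-08-14", "2019-10-17", "2020-11-14"])
by_date = pd.DataFrame({"Courses": ["Spark", "PySpark", "Hadoop"]}, index=dates)

# The index is a DatetimeIndex, so .year and .month are available on it
by_date["year"] = by_date.index.year
by_date["month"] = by_date.index.month
print(by_date)
```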
5. Use Datetime.to_period() Method to Extract Month and Year
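You can also use the Series.dt.to_period('M') method, for example df['Month_Year'] = df['InsertedDate'].dt.to_period('M'), to get the month and year together as a monthly period. The source column, here df['InsertedDate'], has to be of datetime type.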
# Use datetime.to_period() method to extract month and year
df['Month_Year'] = df['InsertedDate'].dt.to_period('M')
print(df)
This yields the following output.
        InsertedDate Month_Year
Spark     2018-08-14    2018-08
PySpark   2019-10-17    2019-10
Hadoop    2020-11-14    2020-11
Python    2020-05-17    2020-05
Pandas    2021-09-15    2021-09
Hadoop    2021-12-14    2021-12
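A period column is more than a formatted label: it still exposes the year and month, and it works as a grouping key. The following is a rough sketch with a hypothetical DataFrame named df_p, assuming you want a count of rows per month.

```python
import pandas as pd

# Hypothetical data with two dates falling in the same month
df_p = pd.DataFrame(
    {"InsertedDate": pd.to_datetime(["2020-11-14", "2020-11-20", "2021-09-15"])}
)
df_p["Month_Year"] = df_p["InsertedDate"].dt.to_period("M")

# Period values still carry the year and month as integers
df_p["Year"] = df_p["Month_Year"].dt.year
df_p["Month"] = df_p["Month_Year"].dt.month

# Count the rows that fall in each monthly period
counts = df_p.groupby("Month_Year").size()
print(counts)
```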
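6. Use Series.apply() With Lambda Function and strftime()
Let’s see how to get the month and year by calling apply() on the datetime column with a lambda function. Because df['InsertedDate'] is a single column, this is Series.apply(), and the lambda formats each timestamp with strftime(), where %B is the full month name.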
# Use Series.apply() with lambda function and strftime()
df['Month_Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%B-%Y'))
print(df)
This yields the following output.
        InsertedDate      Month_Year
Spark     2018-08-14     August-2018
PySpark   2019-10-17    October-2019
Hadoop    2020-11-14   November-2020
Python    2020-05-17        May-2020
Pandas    2021-09-15  September-2021
Hadoop    2021-12-14   December-2021
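The same result can usually be obtained without a lambda by calling dt.strftime() on the whole column. The sketch below compares the two approaches on a hypothetical Series named s. Note that the month names produced by %B depend on the active locale.

```python
import pandas as pd

# Hypothetical datetime Series for the comparison
s = pd.Series(pd.to_datetime(["2018-08-14", "2020-05-17"]))

# Element-wise formatting through apply() and a lambda
via_apply = s.apply(lambda x: x.strftime("%B-%Y"))

# Column-wise formatting through the .dt accessor
via_dt = s.dt.strftime("%B-%Y")

# Both approaches produce the same strings
print(via_apply.tolist() == via_dt.tolist())
```

The apply() form is still handy when the per-row logic goes beyond a single format string.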
7. Use Pandas.to_datetime() and datetime.strftime() Method
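To add a column with ‘year-month’ pairs, convert the column with pd.to_datetime() and format it with dt.strftime('%Y-%m'). The pd.to_datetime() call is only needed when the column is not already of datetime type.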
# Use Pandas.to_datetime() and datetime.strftime() method
df['yyyy-mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y-%m')
print(df)
This yields the following output.
        InsertedDate  yyyy-mm
Spark     2018-08-14  2018-08
PySpark   2019-10-17  2019-10
Hadoop    2020-11-14  2020-11
Python    2020-05-17  2020-05
Pandas    2021-09-15  2021-09
Hadoop    2021-12-14  2021-12
8. Complete Example to Get Month and Year from Pandas Datetime
import pandas as pd
Dates = ["2018-08-14","2019-10-17","2020-11-14","2020-05-17","2021-09-15","2021-12-14"]
Courses =["Spark","PySpark","Hadoop","Python","Pandas","Hadoop"]
df = pd.DataFrame({'InsertedDate': pd.to_datetime(Dates)},index=Courses)
# Use Datetime.strftime() Method to extract month and year
df['Year'] = df['InsertedDate'].dt.strftime('%Y')
df['Month'] = df['InsertedDate'].dt.strftime('%m')
print(df)
# Using the pandas.Series.dt.year & pandas.Series.dt.month attributes
df['Year'] = df['InsertedDate'].dt.year
df['Month'] = df['InsertedDate'].dt.month
print(df)
# Using pandas.DatetimeIndex() to extract month and year
df['year'] = pd.DatetimeIndex(df['InsertedDate']).year
df['month'] = pd.DatetimeIndex(df['InsertedDate']).month
print(df)
# Use datetime.to_period() method to extract month and year
df['Month_Year'] = df['InsertedDate'].dt.to_period('M')
print(df)
# Use Series.apply() with lambda function and strftime()
df['Month_Year'] = df['InsertedDate'].apply(lambda x: x.strftime('%B-%Y'))
print(df)
# Use Pandas.to_datetime() and datetime.strftime() method
df['yyyy-mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y-%m')
print(df)
# Use datetime.strftime() method
df['yyyy'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%Y')
df['mm'] = pd.to_datetime(df['InsertedDate']).dt.strftime('%m')
print(df)
Conclusion
In this article, you have learned how to extract the month and year from a pandas datetime column by using pandas.Series.dt.year, pandas.Series.dt.month, pandas.Series.dt.strftime(), pandas.DatetimeIndex(), Series.dt.to_period(), and Series.apply(), with examples.
Happy Learning !!