The Pandas DataFrame.plot() method can be used to generate a time series plot or line plot from a DataFrame. In time series data, the values are measured at different points in time. Some time series are uniformly spaced at a specific frequency, for example, hourly temperature measurements, the daily volume of website visits, or yearly population counts.
Time series can also be irregularly spaced, for example, events in a log file or a history of 911 emergency calls. In this article, I will explain the syntax, parameters, and usage of the plot() function and show how to plot a time series from a given Pandas DataFrame.
Key Points –
- Import Pandas and any necessary plotting libraries, such as Matplotlib or Seaborn.
- Load your time series data into a Pandas DataFrame and make sure the dates are stored with a datetime type, ideally as the index of the DataFrame.
- Utilize the plot() function directly on the DataFrame to generate the time series plot.
- Customize the plot as needed by specifying parameters such as plot type, title, labels, colors, and styles.
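The key points above can be sketched end to end as follows. This is a minimal illustration that assumes a small synthetic dataset; the column names timestamp and temp and the values are made up for the example and are not part of the Seattle data used later.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd

# Hypothetical daily readings; dates start out as strings, as they often do after loading a CSV
raw = pd.DataFrame({
    "timestamp": pd.date_range("2010-01-01", periods=30, freq="D").strftime("%Y-%m-%d"),
    "temp": np.linspace(38.0, 44.0, 30),
})

# Convert the strings to datetimes and use them as the index
raw["timestamp"] = pd.to_datetime(raw["timestamp"])
ts = raw.set_index("timestamp")

# Plot directly from the Series; the DatetimeIndex becomes the x-axis
ax = ts["temp"].plot(title="Daily temperature (synthetic data)")
ax.set_ylabel("Temperature")
```

With a DatetimeIndex in place, the dates are used for the x-axis automatically, so no x argument is needed.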
Syntax of plot()
Following is the syntax of the Pandas DataFrame plot() method.
# Syntax of plot()
DataFrame.plot(*args, **kwargs)
Parameters of plot()
Following are the parameters of the plot() method.
- data – Series or DataFrame.
- x – Label or position to plot on the x-axis. It can be either a string indicating the column name or a position (e.g., column index or array of column indices).
- y – Label or position to plot on the y-axis. Similar to the ‘x’ parameter, it can be a string indicating the column name or a position.
- kind – The kind of plot to produce:
  - line – line plot (default).
  - bar – vertical bar plot.
  - barh – horizontal bar plot.
  - hist – histogram.
  - box – boxplot.
  - kde – Kernel Density Estimation plot.
  - density – same as ‘kde’.
  - area – area plot.
  - pie – pie plot.
  - scatter – scatter plot (DataFrame only).
  - hexbin – hexbin plot (DataFrame only).
- **kwargs – Options to pass to the matplotlib plotting method.
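To make the x, y, and kind parameters concrete, here is a small sketch. It assumes a hypothetical DataFrame named df_demo with illustrative columns day, low, and high; the numbers are only sample values.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample data; names and values are for illustration only
df_demo = pd.DataFrame({
    "day": pd.date_range("2010-01-01", periods=5, freq="D"),
    "low": [38.6, 38.8, 39.0, 39.2, 39.3],
    "high": [43.5, 43.8, 44.0, 44.2, 44.4],
})

# kind="line" is the default; x and y select columns by name
line_ax = df_demo.plot(x="day", y="low", kind="line")

# y also accepts a list of columns, here drawn as grouped vertical bars
bar_ax = df_demo.plot(x="day", y=["low", "high"], kind="bar")

# scatter requires both x and y and is available on DataFrames only
scatter_ax = df_demo.plot(x="low", y="high", kind="scatter")
```

Each call draws on a new figure, so the three plots do not overwrite one another.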
Return Value
The plot() function in Pandas returns a Matplotlib Axes object, or a NumPy array of Axes objects when subplots=True is passed so that each column is drawn on its own subplot.
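The following sketch checks the return type in both situations. It assumes the default Matplotlib backend and a tiny made-up DataFrame named df_demo.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import pandas as pd
from matplotlib.axes import Axes

# Hypothetical two-column DataFrame for illustration
df_demo = pd.DataFrame({"min": [38.6, 38.8, 39.0], "max": [43.5, 43.8, 44.0]})

# One shared plot: a single Axes object comes back
single = df_demo.plot()
print(type(single))

# One subplot per column: a NumPy array of Axes objects comes back
multi = df_demo.plot(subplots=True)
print(type(multi), multi.shape)
```

Keeping a reference to the returned Axes is useful because labels, titles, and annotations can then be set on that specific plot.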
Usage of plot() Function
The plot() function in Pandas is a wrapper around Matplotlib's plotting functions. It creates 2D plots from the data in a Series or DataFrame. Because the result is a Matplotlib object, you can use Matplotlib's wide range of functionality to fine-tune your plots, such as adding legends, annotations, and multiple subplots, when your data and presentation needs call for more complex visualizations.
Let’s use this Pandas plot() function to create a time series plot. Here I have taken weather data for the city of Seattle from vega_datasets, and using Pandas I will create a line plot of the given dataset.
# Import weather dataset
import pandas as pd
import numpy as np
from vega_datasets import data
import matplotlib.pyplot as plt
# Load seattle temperature data
seattle_temps = data.seattle_temps()
print(seattle_temps.shape)
print(seattle_temps.head())
print(seattle_temps.tail())
Yields below output.
# Shape() output:
(8759, 2)
# Head() output:
date temp
0 2010-01-01 00:00:00 39.4
1 2010-01-01 01:00:00 39.2
2 2010-01-01 02:00:00 39.0
3 2010-01-01 03:00:00 38.9
4 2010-01-01 04:00:00 38.8
# Tail() output:
date temp
8754 2010-12-31 19:00:00 40.7
8755 2010-12-31 20:00:00 40.5
8756 2010-12-31 21:00:00 40.2
8757 2010-12-31 22:00:00 40.0
8758 2010-12-31 23:00:00 39.6
Create Sample Line Plot
Using Seattle’s weather data, let’s make a simple plot by calling the plot() function directly on the temp column.
# Create a line plot
seattle_temps['temp'].plot()
Yields below output.
As you can see from the above, you get a line plot of all the hourly data. Because the temperature changes every hour over the course of a day, the line appears as a band that spans the daily minimum and maximum temperatures. Also, note that the x-axis shows the integer index of the DataFrame, not the date column.
Prepare Data with Time Series
Let’s prepare the data so that we can make line plots with one data point for each day, keyed by date. To do so, first remove the time part of the datetime column.
# Convert the date column to plain dates (drop the time part)
seattle_temps['date'] = seattle_temps['date'].dt.date
print(seattle_temps.tail())
Yields below output.
# Output:
date temp
8754 2010-12-31 40.7
8755 2010-12-31 40.5
8756 2010-12-31 40.2
8757 2010-12-31 40.0
8758 2010-12-31 39.6
We can use the groupby() function along with the agg() function to get the minimum and maximum temperatures for each day.
# Get the min & max temperatures for each day
df = seattle_temps.groupby('date').agg(['min','max'])
print(df)
Yields below output.
# Output:
temp
min max
date
2010-01-01 38.6 43.5
2010-01-02 38.8 43.8
2010-01-03 39.0 44.0
2010-01-04 39.2 44.2
2010-01-05 39.3 44.4
... ...
2010-12-27 37.9 42.8
2010-12-28 38.1 43.0
2010-12-29 38.1 43.0
2010-12-30 38.2 43.1
2010-12-31 38.4 43.3
[365 rows x 2 columns]
The result has a multi-level column index. You can call droplevel() on df.columns to drop one level from it, and then use the reset_index() function to turn the date index back into a regular column and create a flattened DataFrame.
# Drop column level 0 & reset the index
df.columns = df.columns.droplevel(0)
df.reset_index(level=0, inplace=True)
print(df.head())
Yields below output.
# Output:
date min max
0 2010-01-01 38.6 43.5
1 2010-01-02 38.8 43.8
2 2010-01-03 39.0 44.0
3 2010-01-04 39.2 44.2
4 2010-01-05 39.3 44.4
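As an alternative to the grouping and flattening steps above, named aggregation can produce flat min and max columns in one step. This is a sketch on a tiny made-up DataFrame named hourly, not the author's method; only the column names date and temp follow the Seattle data.

```python
import pandas as pd

# Hypothetical hourly readings covering two days
hourly = pd.DataFrame({
    "date": pd.to_datetime([
        "2010-01-01 00:00", "2010-01-01 12:00",
        "2010-01-02 00:00", "2010-01-02 12:00",
    ]),
    "temp": [39.4, 43.5, 38.8, 43.8],
})

# Keep only the calendar date, as in the article
hourly["date"] = hourly["date"].dt.date

# Named aggregation: output_column=(input_column, function)
daily = (
    hourly.groupby("date")
    .agg(min=("temp", "min"), max=("temp", "max"))
    .reset_index()
)
print(daily)
```

Because the output columns are named directly, no multi-level column index is created and droplevel() is not needed.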
Make a Single Line Plot
Using the DataFrame created above, let’s plot the min temperature across different days.
# Get the single line plot
df['min'].plot()
Create Timeseries plot in Pandas
Let’s create a time series plot with the minimum temperature on the y-axis and the date on the x-axis by calling the plot() function directly on the DataFrame. Using matplotlib.pyplot, we can set the axis labels and the title of the plot. For example,
# Create timeseries plot
df.plot(x="date", y="min")
plt.xlabel("Date", size = 20)
plt.ylabel("Minimum Temperature", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)
Customize the Timeseries
We can customize the plots by passing keyword arguments into the plot() function. The rot keyword rotates the tick labels (on the x-axis for vertical plots and on the y-axis for horizontal plots), the fontsize keyword sets the font size of the tick labels, and the colormap keyword selects the set of colors used for the plots. The size argument passed to plt.xlabel(), plt.ylabel(), and plt.title() is a Matplotlib text property that sets the font size of the axis labels and the title.
Here, I pass the rot keyword to the plot() function to rotate the x-axis tick labels by 60 degrees.
# Customize the line plot
df.plot(x="date",y="min", rot = 60)
plt.xlabel("Date",size = 20)
plt.ylabel("Minimum temperature", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)
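The other customization keywords mentioned in this section can be combined in one call. The sketch below assumes a small hypothetical DataFrame named df_demo with the same date, min, and max columns; the choice of the coolwarm colormap and the figure size are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import pandas as pd

# Hypothetical sample with the same column layout as the flattened DataFrame
df_demo = pd.DataFrame({
    "date": pd.date_range("2010-01-01", periods=5, freq="D"),
    "min": [38.6, 38.8, 39.0, 39.2, 39.3],
    "max": [43.5, 43.8, 44.0, 44.2, 44.4],
})

ax = df_demo.plot(
    x="date",
    y=["min", "max"],
    rot=60,               # rotate the x-axis tick labels
    fontsize=12,          # font size of the tick labels
    colormap="coolwarm",  # line colors are sampled from this colormap
    figsize=(10, 4),      # figure width and height in inches
)

# Axis labels and title are set on the returned Axes
ax.set_xlabel("Date", size=20)
ax.set_ylabel("Temperature", size=20)
ax.set_title("Minimum and maximum temperature", size=25)
```

Setting labels on the returned Axes is equivalent to the plt.xlabel() style used in this article, but it stays tied to this plot even when several figures are open.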
Make a Default Line Plot using DataFrame
Here, I will create a line plot of the given DataFrame using the plot() function without any arguments. It uses the default integer index on the x-axis and plots the numeric min and max columns on the y-axis, so the result is a plot with two lines.
# Line plot of DataFrame
df.plot()
plt.xlabel("Index", size = 20)
plt.ylabel("Temp", size = 20)
plt.title("Min and max temperature of Seattle", size = 25)
Now, we can put the date column on the x-axis and make a time series plot. For that, we need to set the date column as the index of the DataFrame and then apply the plot() function, which draws the min and max columns against the dates.
# Timeseries plot of DataFrame
df.set_index('date').plot(rot=60)
plt.xlabel("Date", size = 20)
plt.ylabel("Temp", size = 20)
plt.title("Min and max temperature of Seattle", size = 25)
FAQ on Generate Time Series Plot in Pandas
To generate a time series plot in Pandas, you can use the plot() function on a DataFrame with a datetime index.
To set the figure size and title for your time series plot in Pandas, you can use Matplotlib functions, since Pandas plotting functions utilize Matplotlib underneath.
To plot multiple time series on the same plot using Pandas, you can pass a list of column names to the y parameter of the plot() function.
To add labels to the x-axis and y-axis in a time series plot using Pandas, you can use the xlabel() and ylabel() functions from Matplotlib.
To save a time series plot generated in Pandas to a file, you can use the savefig() function from Matplotlib and pass it the desired filename, for example 'your_plot.png'. The file format (e.g., PNG, PDF, JPEG) is determined by the file extension.
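The answers above can be combined into one short sketch. It assumes a hypothetical DataFrame named df_demo with a datetime index and saves the file to the system temporary directory; the filename your_plot.png is only a placeholder.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample data with a datetime index
df_demo = pd.DataFrame(
    {"min": [38.6, 38.8, 39.0], "max": [43.5, 43.8, 44.0]},
    index=pd.date_range("2010-01-01", periods=3, freq="D"),
)

# Multiple series via a list for y; figure size and title via keyword arguments
ax = df_demo.plot(y=["min", "max"], figsize=(10, 4), title="Daily temperatures")

# Axis labels through Matplotlib
plt.xlabel("Date")
plt.ylabel("Temperature")

# The file extension determines the output format
out_path = os.path.join(tempfile.gettempdir(), "your_plot.png")
plt.savefig(out_path)
```

Calling savefig() before plt.show() is the safer order in interactive sessions, since some backends clear the figure once the window is closed.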
Conclusion
In this article, I have explained the syntax, parameters, and usage of the plot() function and how to use it to plot a time series from a DataFrame. I have also explained how to customize time series plots and line plots.
Happy learning !!