  • Post category:Pandas
  • Post last modified:May 9, 2024
How to Generate Time Series Plot in Pandas

The pandas DataFrame.plot() method is used to generate a time series plot or line plot from a DataFrame. In time series data, values are measured at different points in time. Some time series are uniformly spaced at a specific frequency, for example, hourly temperature measurements, the daily volume of website visits, or yearly population counts.

Time series can also be irregularly spaced, for example, events in a log file or a history of 911 emergency calls. In this article, I will explain the plot() function, including its syntax and parameters, and show how to use it to plot a time series from a given pandas DataFrame.

Key Points –

  • Import Pandas and any necessary plotting libraries, such as Matplotlib or Seaborn.
  • Load your time series data into a Pandas DataFrame and make sure the dates are stored as datetime values, ideally as a DatetimeIndex.
  • Call the plot() function directly on the DataFrame to generate the time series plot.
  • Customize the plot as needed by specifying parameters such as plot type, title, labels, colors, and styles.
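
The key points above can be sketched as follows. This is a minimal illustration on made-up daily values, not the Seattle dataset used later in the article; the names `dates` and `demo` are hypothetical, and the Agg backend is selected only so that the script runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Made-up daily readings, used only for illustration
dates = pd.date_range("2010-01-01", periods=30, freq="D")
demo = pd.DataFrame({"temp": 40 + 3 * np.sin(np.arange(30) / 5)}, index=dates)

# The index is a DatetimeIndex, so plot() places the dates on the x-axis
ax = demo.plot(kind="line", title="Synthetic daily temperature")
```

Because the index already holds datetime values, no `x` argument is needed here.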

Syntax of plot()

Following is the syntax of the pandas DataFrame plot() method.


# Syntax of plot()
DataFrame.plot(*args, **kwargs)

Parameters of plot()

Following are the parameters of the plot() method.

  • data – Series or DataFrame. The object on which the method is called.
  • x – Label or position to plot on the x-axis. It can be a column name or a column position. Used only when the data is a DataFrame.
  • y – Label, position, or list of labels or positions to plot on the y-axis. Used only when the data is a DataFrame.
  • kind – Type of plot to produce. The accepted values are listed below.
  • **kwargs – Options to pass to the Matplotlib plotting method.

The kind of plot to produce:

  • line - line plot (default).
  • bar - vertical bar plot.
  • barh - horizontal bar plot.
  • hist - histogram.
  • box - boxplot.
  • kde - Kernel Density Estimation plot.
  • density - same as ‘kde’.
  • area - area plot.
  • pie - pie plot.
  • scatter - scatter plot (DataFrame only).
  • hexbin - hexbin plot (DataFrame only).
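
As a rough illustration of the kind parameter, the sketch below draws the same made-up Series twice, once as a line plot and once as a histogram. The data and the names `temps`, `ax_line`, and `ax_hist` are assumptions for this example only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up daily readings, used only for illustration
dates = pd.date_range("2010-01-01", periods=30, freq="D")
temps = pd.Series(40 + 3 * np.sin(np.arange(30) / 5), index=dates, name="temp")

# Two side-by-side axes so each kind gets its own panel
fig, (ax_line, ax_hist) = plt.subplots(1, 2, figsize=(10, 4))
temps.plot(kind="line", ax=ax_line)           # values over time
temps.plot(kind="hist", bins=10, ax=ax_hist)  # distribution of the values
```

Passing `ax` explicitly keeps each plot on the intended axes.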

Return Value

The plot() function in Pandas returns a Matplotlib Axes object, or a NumPy array of Axes objects when subplots=True is passed and each column is drawn in its own subplot.
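
The two kinds of return value can be checked with a short sketch. The DataFrame below is made up for illustration, and the names `single` and `panels` are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
from matplotlib.axes import Axes
import numpy as np
import pandas as pd

# Made-up two-column DataFrame with a daily DatetimeIndex
dates = pd.date_range("2010-01-01", periods=30, freq="D")
base = 40 + 3 * np.sin(np.arange(30) / 5)
demo = pd.DataFrame({"min": base, "max": base + 5}, index=dates)

single = demo.plot()               # one Axes holding both lines
panels = demo.plot(subplots=True)  # one Axes per column, in a NumPy array

print(type(single).__name__, type(panels).__name__, len(panels))
```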

Usage of Plot() Function

The pandas plot() method is a wrapper around Matplotlib's plotting functions. It draws the columns of a Series or DataFrame and returns Matplotlib objects, so you can fine-tune the result with Matplotlib, for example by adding legends, annotations, or multiple subplots.

Let’s use the pandas plot() function to create a time series plot. Here I use the Seattle weather data from vega_datasets and draw a line plot of it with pandas.


# Import weather dataset
import pandas as pd
import numpy as np
from vega_datasets import data
import matplotlib.pyplot as plt

# Load seattle temperature data
seattle_temps = data.seattle_temps()
print(seattle_temps.shape)
print(seattle_temps.head())
print(seattle_temps.tail())

Yields below output.


# Shape() output:
(8759, 2)

# Head() output:
                 date  temp
0 2010-01-01 00:00:00  39.4
1 2010-01-01 01:00:00  39.2
2 2010-01-01 02:00:00  39.0
3 2010-01-01 03:00:00  38.9
4 2010-01-01 04:00:00  38.8

# Tail() output:
                    date  temp
8754 2010-12-31 19:00:00  40.7
8755 2010-12-31 20:00:00  40.5
8756 2010-12-31 21:00:00  40.2
8757 2010-12-31 22:00:00  40.0
8758 2010-12-31 23:00:00  39.6

Create Sample line Plot

Using Seattle’s weather data, let’s make a simple plot by calling the plot() function directly on the temp column.


# Create a line plot 
seattle_temps['temp'].plot()

Yields below output.

Line plot of temperature

As you can see from the above, you get a line plot of all the data. Because the temperature is recorded every hour and changes over the course of each day, the line appears as a band that spans each day's minimum and maximum temperature. Also, note that the x-axis shows the index values of the DataFrame, not the date column.

Prepare Data with Time Series

Let’s prepare the data so that we can make line plots with one data point for each day and, later, put the date column on the x-axis. To do so, first remove the time part of the datetime column.


# Convert the date column to plain dates (drop the time part)
seattle_temps['date'] = seattle_temps['date'].dt.date
print(seattle_temps.tail())

Yields below output.


# Output:
            date  temp
8754  2010-12-31  40.7
8755  2010-12-31  40.5
8756  2010-12-31  40.2
8757  2010-12-31  40.0
8758  2010-12-31  39.6
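
Note that `.dt.date` returns Python date objects, so the column's dtype becomes object. If you would rather keep a datetime64 column, one alternative (not used in this article) is `.dt.normalize()`, which sets the time part to midnight. The sketch below compares the two on a few made-up timestamps; the names `stamps`, `as_dates`, and `as_midnight` are hypothetical.

```python
import pandas as pd

# Five made-up hourly timestamps on the same day
stamps = pd.Series(
    pd.date_range("2010-12-31 19:00", periods=5, freq=pd.Timedelta(hours=1))
)

as_dates = stamps.dt.date            # Python date objects, object dtype
as_midnight = stamps.dt.normalize()  # still datetime64, time set to 00:00:00

print(as_dates.dtype, as_midnight.dtype)
```

Both versions group the rows by day in the same way; the difference is only in the dtype of the resulting column.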

We can use the groupby() function along with the agg() function to get the minimum and maximum temperatures for each day.


# Get the min & max temperatures for each day
df = seattle_temps.groupby('date').agg(['min','max'])
print(df)

Yields below output.


# Output:
           temp      
             min   max
date                  
2010-01-01  38.6  43.5
2010-01-02  38.8  43.8
2010-01-03  39.0  44.0
2010-01-04  39.2  44.2
2010-01-05  39.3  44.4
         ...   ...
2010-12-27  37.9  42.8
2010-12-28  38.1  43.0
2010-12-29  38.1  43.0
2010-12-30  38.2  43.1
2010-12-31  38.4  43.3

[365 rows x 2 columns]

Next, use the droplevel() method of the column index to drop a level from the multi-level columns of the DataFrame. Then call the reset_index() function to turn the date index back into a regular column, which gives a flattened DataFrame.


# Drop the level 0 column & reset the index
df.columns = df.columns.droplevel(0)
df.reset_index(level=0, inplace=True)
print(df.head())

Yields below output.


# Output:
         date   min   max
0  2010-01-01  38.6  43.5
1  2010-01-02  38.8  43.8
2  2010-01-03  39.0  44.0
3  2010-01-04  39.2  44.2
4  2010-01-05  39.3  44.4
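
As an alternative sketch, selecting the temp column before aggregating gives flat min and max columns directly, so no droplevel() step is needed. The hourly data below is made up (two days with temperatures 0 to 47), and the names `hourly` and `daily` are hypothetical.

```python
import numpy as np
import pandas as pd

# Made-up hourly readings covering two days
hours = pd.date_range("2010-01-01", periods=48, freq=pd.Timedelta(hours=1))
hourly = pd.DataFrame({"date": hours, "temp": np.arange(48.0)})
hourly["date"] = hourly["date"].dt.date  # keep only the date part

# Aggregate a single column: the result has plain 'min' and 'max' columns
daily = hourly.groupby("date")["temp"].agg(["min", "max"]).reset_index()
print(daily)
```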

Make a Single Line plot

Using the DataFrame created above, let’s plot the minimum temperature across the days.


# Get the single line plot
df['min'].plot()

Line plot of minimum temperature with Pandas

Create Timeseries plot in Pandas

Let’s create a time series plot with the minimum temperature on the y-axis and the date on the x-axis by calling the plot() function directly on the DataFrame. Using matplotlib.pyplot, we can set the axis labels and the title of the plot. For example,


# Create timeseries plot
df.plot(x="date", y="min")
plt.xlabel("Date",  size = 20)
plt.ylabel("Minimum Temperature", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)

Line plot of minimum temperature by date with Pandas

Customize the Timeseries

We can customize the plot by passing keyword arguments into the plot() function. The rot keyword rotates the tick labels (the x-axis ticks for vertical plots and the y-axis ticks for horizontal plots), the fontsize keyword sets the font size of the tick labels, and the colormap keyword selects the set of colors used for the plot. The size argument used in the examples belongs to Matplotlib's xlabel(), ylabel(), and title() functions and sets the font size of the axis labels and the title.

Here, I pass the rot keyword to the plot() function to rotate the tick labels of the x-axis by 60 degrees.


# Customize the line plot
df.plot(x="date",y="min", rot = 60)
plt.xlabel("Date",size = 20)
plt.ylabel("Minimum temperature", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)

Time series plot of minimum temperature with Pandas
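
The customization keywords can be combined in a single call. The sketch below is an illustration on made-up data with hypothetical names (`daily`, `line_colors`); it passes rot, fontsize, and colormap together and then reads back the line colors that the colormap produced.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
from matplotlib.colors import to_hex
import numpy as np
import pandas as pd

# Made-up daily min/max values with a DatetimeIndex
dates = pd.date_range("2010-01-01", periods=30, freq="D")
base = 40 + 3 * np.sin(np.arange(30) / 5)
daily = pd.DataFrame({"min": base, "max": base + 5}, index=dates)

# rot: tick label rotation, fontsize: tick label size, colormap: line colors
ax = daily.plot(rot=60, fontsize=12, colormap="viridis")

# Each column gets a different color sampled from the colormap
line_colors = [to_hex(line.get_color()) for line in ax.get_lines()]
print(line_colors)
```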

Make a default Line Plot using DataFrame

Here, I create a line plot of the whole DataFrame using the plot() function. It puts the default index on the x-axis and the min and max columns on the y-axis, so the result is a plot with two lines.


# Line plot of DataFrame
df.plot()
plt.xlabel("Index", size = 20)
plt.ylabel("Temp", size = 20)
plt.title("Minimum and maximum temperature of Seattle", size = 25)

Line plot of temperature using Pandas

Now, we can put the date column on the x-axis and make a time series plot. For that, we need to set the date column as the index of the DataFrame and then apply the plot() function, which returns the time series plot of the given DataFrame.


# Timeseries plot of DataFrame
df.set_index('date').plot(rot=60)
plt.xlabel("Date", size = 20)
plt.ylabel("Temp", size = 20)
plt.title("Minimum and maximum temperature of Seattle", size = 25)

Time series plot of temperature using Pandas

FAQ on Generate Time Series Plot in Pandas

How do I generate a time series plot in Pandas?

To generate a time series plot in Pandas, you can use the plot function on a DataFrame with a datetime index.

How can I set the figure size and title for my time series plot?

To set the figure size and title for your time series plot in Pandas, you can pass the figsize and title arguments to plot(), or use Matplotlib functions, since Pandas plotting functions use Matplotlib underneath.
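
A minimal sketch of this, assuming made-up data in a hypothetical DataFrame named `daily`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Made-up daily minimum temperatures with a DatetimeIndex
dates = pd.date_range("2010-01-01", periods=30, freq="D")
daily = pd.DataFrame({"min": 40 + 3 * np.sin(np.arange(30) / 5)}, index=dates)

# figsize is (width, height) in inches; title is drawn above the axes
ax = daily.plot(y="min", figsize=(10, 4), title="Synthetic minimum temperature")
print(ax.get_figure().get_size_inches())
```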

How can I plot multiple time series on the same plot?

To plot multiple time series on the same plot using Pandas, you can pass a list of column names to the y parameter in the plot function.
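
For example, the sketch below plots two made-up columns against a date column; the DataFrame `daily` and its values are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Made-up daily min/max values with the dates in a regular column
dates = pd.date_range("2010-01-01", periods=30, freq="D")
base = 40 + 3 * np.sin(np.arange(30) / 5)
daily = pd.DataFrame({"date": dates, "min": base, "max": base + 5})

# A list passed to y draws one line per column on the same axes
ax = daily.plot(x="date", y=["min", "max"])
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```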

How can I add labels to the x-axis and y-axis?

To add labels to the x-axis and y-axis in a time series plot using Pandas, you can use the xlabel and ylabel functions from Matplotlib.
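
Since plot() returns a Matplotlib Axes object, the labels can also be set through that object rather than through pyplot. The following is a small sketch on made-up data with a hypothetical DataFrame named `daily`:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Made-up daily minimum temperatures with a DatetimeIndex
dates = pd.date_range("2010-01-01", periods=30, freq="D")
daily = pd.DataFrame({"min": 40 + 3 * np.sin(np.arange(30) / 5)}, index=dates)

ax = daily.plot(y="min", legend=False)
ax.set_xlabel("Date", size=14)                 # same effect as plt.xlabel(...)
ax.set_ylabel("Minimum temperature", size=14)  # same effect as plt.ylabel(...)
```

Working with the returned Axes avoids relying on whichever figure pyplot currently treats as active.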

How can I save the time series plot to a file?

To save a time series plot generated in Pandas to a file, you can use the savefig function from Matplotlib, for example plt.savefig('your_plot.png'). Replace 'your_plot.png' with the desired filename and extension. The file format (e.g., PNG, PDF, JPEG) is determined by the file extension.
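
A minimal sketch of saving a plot, assuming made-up data and a temporary directory as the destination (the names `daily` and `out_path` are hypothetical):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in a notebook
import numpy as np
import pandas as pd

# Made-up daily minimum temperatures with a DatetimeIndex
dates = pd.date_range("2010-01-01", periods=30, freq="D")
daily = pd.DataFrame({"min": 40 + 3 * np.sin(np.arange(30) / 5)}, index=dates)

ax = daily.plot(y="min")

# Save through the figure that owns the axes; the .png extension selects PNG
out_path = os.path.join(tempfile.mkdtemp(), "your_plot.png")
ax.get_figure().savefig(out_path, dpi=150, bbox_inches="tight")
print(os.path.exists(out_path))
```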

Conclusion

In this article, I have explained the plot() function, including its syntax and parameters, and how to use it to plot a time series from a DataFrame. I have also explained how to customize time series plots and line plots.

Happy learning !!