  • Post last modified:March 27, 2024
How to Generate Line Plot in a DataFrame?

The pandas DataFrame.plot() method generates a line plot from a DataFrame; line is the default plot kind. It can plot one column against another. If x is not specified, the DataFrame index is used for the x-axis and each numeric column is plotted against it.


In this article, I will explain what a line plot is and how to draw one from a given pandas DataFrame using plot().

1. Quick Examples of Line Plot

If you are in a hurry, below are some quick examples of how to generate a line plot from a DataFrame.


# Quick examples of line plot

# Example 1: Create a line plot 
seattle_temps['temp'].plot()

# Example 2: Default line plot
df.plot()

# Example 3: Get the single line plot
df['min'].plot()

# Example 4: Customize the Line plot of DataFrame
df.plot(rot = 60)
plt.xlabel("Index", size = 20)
plt.ylabel("Temp", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)

# Example 5: Create multiple lines on separate plots
df.plot(subplots = True)

# Example 6: Create timeseries plot
df.plot(x="date", y="min")
plt.xlabel("Date",  size = 20)
plt.ylabel("Minimum Temperature", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)

2. Syntax of Pandas plot()

Following is the syntax of the plot() function which I will be using to create a time series plot.


# Syntax of plot()
DataFrame.plot(*args, **kwargs)

2.1 Parameters of plot() function

Following are the parameters of the plot() function.

  • data: Series or DataFrame.
  • x: label or position, default None. Only used if data is a DataFrame.
  • y: label, position or list of label, positions, default None. It allows the plotting of multiple columns. Only used if data is a DataFrame.
  • kind: str. It defines the type of plot to create; the default value is line.

The kind of plot to produce:

  • line - line plot (default)
  • bar - vertical bar plot
  • barh - horizontal bar plot
  • hist - histogram
  • box - boxplot
  • kde - Kernel Density Estimation plot
  • density - same as ‘kde’
  • area - area plot
  • pie - pie plot
  • scatter - scatter plot (DataFrame only)
  • hexbin - hexbin plot (DataFrame only)
  • **kwargs: Options to pass to matplotlib plotting method.
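
As a rough illustration of the kind parameter, the sketch below draws the same data as a line plot and as a bar plot. The sample DataFrame is made up for the example and is not the Seattle dataset used later; the Agg backend is selected only so the code runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import pandas as pd

# Hypothetical sample data (three days, two columns)
sample = pd.DataFrame({"min": [38.6, 38.8, 39.0], "max": [43.5, 43.8, 44.0]})

line_ax = sample.plot()             # kind="line" is the default
bar_ax = sample.plot(kind="bar")    # same data as a vertical bar plot
bar_ax2 = sample.plot.bar()         # accessor form of kind="bar"

n_lines = len(line_ax.get_lines())  # one line per numeric column
n_bars = len(bar_ax.patches)        # one bar per row and column
print(n_lines, n_bars)
```

The accessor form (sample.plot.bar()) and the kind="bar" form should produce the same kind of chart; which one to use is mostly a matter of taste.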

2.2 Return Value

It returns a matplotlib.axes.Axes object, or a numpy.ndarray of them when subplots=True.
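
A small sketch of the two return types, using a made-up DataFrame (an assumption for illustration, not the article's data):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import numpy as np
import pandas as pd
from matplotlib.axes import Axes

# Hypothetical sample data
sample = pd.DataFrame({"min": [38.6, 38.8, 39.0], "max": [43.5, 43.8, 44.0]})

single = sample.plot()                # one Axes holding both lines
several = sample.plot(subplots=True)  # one Axes per column, returned as an array

print(type(single).__name__)
print(type(several).__name__, several.shape)
```

Keeping the returned Axes is handy because later customization (labels, titles, saving) can be done on that object instead of relying on pyplot's notion of the current figure.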

3. Introduction to Plotting

The pandas library is focused mainly on data analysis rather than data visualization, but it can also create basic plots. This makes pandas practical for exploratory data analysis. It provides plot() and several other wrapper functions for visualizing our data.

Let’s use the pandas plot() function to create a time series plot. Here I take the Seattle weather data from vega_datasets and draw a line plot of it with pandas.

To access these datasets from Python, you can use the vega_datasets Python package. Let’s import the Seattle temperature data, which has two columns: date and temp. The date column holds hourly timestamps in the form yyyy-mm-dd hh:mm:ss.


# Import weather dataset
import pandas as pd
import numpy as np
from vega_datasets import data
import matplotlib.pyplot as plt

# Load seattle temperature data
seattle_temps = data.seattle_temps()
print(seattle_temps.shape)
print(seattle_temps.head())
print(seattle_temps.tail())

Yields below output.


# Shape() output:
(8759, 2)

# Head() output:
                 date  temp
0 2010-01-01 00:00:00  39.4
1 2010-01-01 01:00:00  39.2
2 2010-01-01 02:00:00  39.0
3 2010-01-01 03:00:00  38.9
4 2010-01-01 04:00:00  38.8

# Tail() output:
                    date  temp
8754 2010-12-31 19:00:00  40.7
8755 2010-12-31 20:00:00  40.5
8756 2010-12-31 21:00:00  40.2
8757 2010-12-31 22:00:00  40.0
8758 2010-12-31 23:00:00  39.6

4. Pandas plot() Function to Create Sample line Plot

We can call plot() directly on the temp column of the Seattle weather data above to create a line plot.


# Create a line plot 
seattle_temps['temp'].plot()

Yields below output.

Line plot of temperature

As you can see from the above, the line plot contains every hourly reading. Because the temperature changes hour by hour within each day, the line appears as a band whose lower and upper edges show the minimum and maximum temperature of each day. Also note that the x-axis shows the DataFrame index, not the date column.

4.1 Extract Date from Datetime

Let’s remove the time part from the date column.


# Convert date column as simple date
seattle_temps['date'] = seattle_temps['date'].dt.date
print(seattle_temps.tail())

Yields below output.


# Output:
            date  temp
8754  2010-12-31  40.7
8755  2010-12-31  40.5
8756  2010-12-31  40.2
8757  2010-12-31  40.0
8758  2010-12-31  39.6

Let’s also get minimum and maximum temperatures for each day using Pandas groupby() function along with pandas agg() function.


# Get the min & max temperatures
df = seattle_temps.groupby('date').agg(['min','max'])
print(df)

Yields below output.


# Output:
           temp      
             min   max
date                  
2010-01-01  38.6  43.5
2010-01-02  38.8  43.8
2010-01-03  39.0  44.0
2010-01-04  39.2  44.2
2010-01-05  39.3  44.4
         ...   ...
2010-12-27  37.9  42.8
2010-12-28  38.1  43.0
2010-12-29  38.1  43.0
2010-12-30  38.2  43.1
2010-12-31  38.4  43.3

[365 rows x 2 columns]

The aggregation produces a two-level column index. Calling droplevel() on df.columns drops level 0 (temp) and flattens the columns. Then reset_index() turns the date index back into a regular column.


# Drop the level 0 column & reset the index
df.columns = df.columns.droplevel(0)
df.reset_index(level=0, inplace=True)
print(df.head())

Yields below output.


# Output:
         date   min   max
0  2010-01-01  38.6  43.5
1  2010-01-02  38.8  43.8
2  2010-01-03  39.0  44.0
3  2010-01-04  39.2  44.2
4  2010-01-05  39.3  44.4
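
As an alternative sketch, selecting the temp column before aggregating should give flat min and max columns directly, so no droplevel() step is needed. The tiny readings DataFrame below is invented for the example and only mimics the shape of the Seattle data.

```python
import pandas as pd

# Hypothetical readings: two timestamps per day for two days
readings = pd.DataFrame({
    "date": pd.to_datetime([
        "2010-01-01 00:00", "2010-01-01 12:00",
        "2010-01-02 00:00", "2010-01-02 12:00",
    ]),
    "temp": [38.6, 43.5, 38.8, 43.8],
})

# Keep only the calendar date, as in the step above
readings["date"] = readings["date"].dt.date

# Selecting "temp" first yields single-level columns named min and max
daily = readings.groupby("date")["temp"].agg(["min", "max"]).reset_index()
print(daily)
```

The result has the same three columns (date, min, max) as the flattened DataFrame above.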

5. Default Line Plot using DataFrame

Here, I create a line plot of the DataFrame using plot(). By default, it puts the index on the x-axis and plots the numeric columns, min and max, on the y-axis, so the resulting plot has two lines.


# Default line plot
df.plot()

Yields below output.

Line plot of Pandas DataFrame

6. Make a Single Line plot

Using the DataFrame created above, let’s plot the min temperature across the days.


# Get the single line plot
df['min'].plot()

Yields below output.

Line plot of minimum temperature with Pandas

7. Customize the Line Plots

We can customize the plots by passing keyword arguments into the plot() function. The rot keyword rotates the tick labels (x-axis ticks for vertical plots, y-axis ticks for horizontal plots), the fontsize keyword sets the font size of the tick labels, and the colormap keyword selects a different color set for the plot.

Using matplotlib.pyplot we can set the axis labels and the title of the plot; the size argument of those functions sets their font size. For example, here I pass rot = 60 into plot(), which rotates the x-axis tick labels by 60 degrees.


# Customize the Line plot of DataFrame
df.plot(rot = 60)
plt.xlabel("Index", size = 20)
plt.ylabel("Temp", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)

Yields below output.

Line Plot of temperature using Pandas
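
The example above only uses rot. A minimal sketch of the colormap option, together with plot()'s fontsize keyword for tick labels, might look like this; the sample data and the choice of the viridis colormap are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import pandas as pd
from matplotlib.colors import to_hex

# Hypothetical sample data
sample = pd.DataFrame({"min": [38.6, 38.8, 39.0], "max": [43.5, 43.8, 44.0]})

# rot rotates tick labels, fontsize sizes them, colormap picks the line colors
ax = sample.plot(rot=60, fontsize=12, colormap="viridis")

# Collect the color assigned to each line as a hex string
colors = [to_hex(line.get_color()) for line in ax.get_lines()]
print(colors)
```

With a colormap, pandas samples one color per column, so the two lines should come out in visibly different colors.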

8. Plot Multiple Lines on Separate Plots

We can draw each line on its own plot using the plot() function. To do so, pass the keyword argument subplots = True; each column is then drawn on a separate subplot.


# Create multiple lines on separate plots
df.plot(subplots = True)

Yields below output.

Multiple line plots

9. Create Timeseries plot in Pandas

Let’s create a time series plot with the minimum temperature on the y-axis and the date on the x-axis by calling plot() directly on the DataFrame.


# Create timeseries plot
df.plot(x="date", y="min")
plt.xlabel("Date",  size = 20)
plt.ylabel("Minimum Temperature", size = 20)
plt.title("Minimum temperature of Seattle", size = 25)

Yields below output.

Time series plot of minimum temperature with Pandas
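
If the date column holds strings or plain date objects, converting it with pd.to_datetime() first lets matplotlib treat the x-axis as real dates. The sketch below assumes a small made-up DataFrame with the same column names as above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import pandas as pd

# Hypothetical daily minimums with dates stored as strings
daily = pd.DataFrame({
    "date": ["2010-01-01", "2010-01-02", "2010-01-03", "2010-01-04"],
    "min": [38.6, 38.8, 39.0, 39.2],
})

# Convert to datetime64 so the x-axis is date-aware
daily["date"] = pd.to_datetime(daily["date"])

ax = daily.plot(x="date", y="min")
ax.set_xlabel("Date")
ax.set_ylabel("Minimum Temperature")
print(daily.dtypes)
```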

Frequently Asked Questions on Generate Line Plot in a DataFrame

How can I generate a line plot from a DataFrame using a specific column as the y-axis?

To generate a line plot from a DataFrame using a specific column as the y-axis, select that column and call the plot() function provided by pandas. For example, calling plot(kind='line') on a column named 'y_values' creates a line plot of that column's values against the index (the default x-axis).
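
A minimal sketch of this, assuming a hypothetical DataFrame with a 'y_values' column:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import pandas as pd

# Hypothetical data; the column name follows the answer above
df = pd.DataFrame({"y_values": [3, 5, 4, 6]})

# Plot the chosen column against the default integer index
ax = df["y_values"].plot(kind="line")

line = ax.get_lines()[0]
x_data = list(line.get_xdata())
y_data = list(line.get_ydata())
print(x_data, y_data)
```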

How do I customize the line plot, such as adding labels and a title?

You can customize a line plot in pandas with additional plot() parameters and with functions provided by the matplotlib library. For example, plt.xlabel(), plt.ylabel(), and plt.title() add axis labels and a title to your line plot.
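
One possible sketch uses the title, xlabel, and ylabel keywords of plot() itself (the xlabel and ylabel keywords need pandas 1.1 or later); the sample data and label texts are invented for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import pandas as pd

# Hypothetical sample data
sample = pd.DataFrame({"min": [38.6, 38.8, 39.0], "max": [43.5, 43.8, 44.0]})

# Labels and title passed directly to plot()
ax = sample.plot(
    title="Daily temperature range",
    xlabel="Day",
    ylabel="Temperature",
)
print(ax.get_title(), ax.get_xlabel(), ax.get_ylabel())
```

Calling plt.title(), plt.xlabel(), and plt.ylabel() after plot() should give the same result; the keyword form just keeps everything in one call.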

Can I plot multiple lines on the same graph from different columns?

You can plot multiple lines on the same graph from different columns in a DataFrame. For example, call df.plot() with the x parameter set to the column containing the x-values and the y parameter set to a list of columns for the y-values. This creates a line plot with one line per y column. Pandas adds a legend by default, and plt.legend() can be used to override its labels.
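
A short sketch of this pattern, with hypothetical column names and legend labels:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import pandas as pd

# Hypothetical data with one x column and two y columns
df = pd.DataFrame({
    "x_values": [1, 2, 3, 4],
    "y_values1": [10, 12, 11, 13],
    "y_values2": [20, 18, 19, 17],
})

# One line per entry in the y list, all on the same Axes
ax = df.plot(x="x_values", y=["y_values1", "y_values2"])

# Override the default legend labels (the column names)
ax.legend(["First series", "Second series"])
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(legend_labels)
```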

How can I change the color or style of the line?

You can change the color or style of the line in a pandas line plot by using the color and style parameters in the plot function.
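
For instance, the sketch below draws a red dashed line; the data and the specific color and style values are assumptions chosen for the example.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import pandas as pd

# Hypothetical data
df = pd.DataFrame({"y_values": [3, 5, 4, 6]})

# color sets the line color; style takes a matplotlib line-style string
ax = df["y_values"].plot(color="red", style="--")

line = ax.get_lines()[0]
print(line.get_color(), line.get_linestyle())
```

Note that a style string which already contains a color letter (such as "r--") should not be combined with the color keyword; use one or the other.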

Can I create a line plot for multiple columns in the same DataFrame?

You can create a line plot for multiple columns in the same DataFrame by specifying the columns you want to plot in the y parameter of the plot function. For example, setting y to a list of column names such as ['y_values1', 'y_values2'] draws both columns on the same graph, and the legend differentiates the lines.

How can I save the line plot as an image file?

You can save a line plot as an image file using the savefig function from the matplotlib.pyplot module. For example, plt.savefig('line_plot.png') saves the plot as a PNG image file in the current working directory. You can change the file extension to save the plot in different formats (e.g., ‘line_plot.jpg’, ‘line_plot.pdf’).
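
A minimal sketch of saving a plot, writing to a temporary directory so the example does not touch the working directory (the data and the dpi value are assumptions):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical data
df = pd.DataFrame({"y_values": [3, 5, 4, 6]})

df.plot()

# Save the current figure; the file extension selects the format
out_path = os.path.join(tempfile.mkdtemp(), "line_plot.png")
plt.savefig(out_path, dpi=150, bbox_inches="tight")
plt.close()

print(os.path.getsize(out_path) > 0)
```

Call savefig() before plt.show() in interactive sessions, since some backends clear the figure once the window is closed.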

Conclusion

In this article, I have explained what a line plot is and how to draw a line plot or a time series plot of a DataFrame using the plot() function. I have also explained how to customize line and time series plots using optional parameters.

Happy learning !!

Vijetha

Vijetha is an experienced technical writer with a strong command of various programming languages. She has had the opportunity to work extensively with a diverse range of technologies, including Python, Pandas, NumPy, and R. Throughout her career, Vijetha has consistently exhibited a remarkable ability to comprehend intricate technical details and adeptly translate them into accessible and understandable materials.