We can use the DataFrame.plot() method to plot the distribution of column values in a Pandas DataFrame. It is Pandas' built-in data visualization method, and it can draw the same data as several different kinds of plots. In this article, I will explain how to use plot() to visualize the distribution of a DataFrame's column values as KDE and histogram plots.
Key Points –
- KDE Plot (Kernel Density Estimate) provides a smooth estimate of the distribution of a column’s values, useful for visualizing the probability density function.
- Histogram displays the frequency of data points in different bins, allowing for easy visualization of the distribution’s shape (e.g., normal, skewed).
- The plot() method in Pandas allows for quick and flexible plotting of distributions, with options for different kinds of plots (e.g., 'kde', 'hist').
- Edge color in histograms (the edgecolor parameter) is often used to differentiate bars visually, making plots clearer.
- Grouping by categories (using groupby()) allows for the comparison of distributions across different groups or categories.
- Adjusting plot appearance (e.g., title, axis labels, color, line style) enhances the readability and interpretability of the distribution plots.
Quick Examples of Plot Distribution of Column Values
Following are quick examples of plotting the distribution of column values in Pandas.
# Quick examples of plot distribution of column values
# Example 1: Plot distribution of values in Marks column
# Using KDE
df['Marks'].plot(kind='kde')
# Example 2: Plot distribution of values in marks column
# Using histogram
df['Marks'].plot(kind='hist', edgecolor='black')
# Example 3: Plot distribution of Marks by Students
# Using KDE
df.groupby('Students')['Marks'].plot(kind='kde')
# Example 4: Plot distribution of Marks by Students
# Using histogram
df.groupby('Students')['Marks'].plot(kind='hist')
Let's create a Pandas DataFrame from a Python dictionary with the columns 'Students' and 'Marks'. We will then call plot() on this DataFrame to visualize the distribution of its column values in different kinds of plots.
# Create Pandas DataFrame
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Students': ['Student1', 'Student1', 'Student1', 'Student2', 'Student2',
                 'Student1', 'Student1', 'Student1', 'Student2', 'Student2'],
    'Marks': [80.4, 50.6, 70.4, 50.2, 80.5, 70.4, 50.4, 60.4, 90.1, 90.5]
})
print("Create DataFrame:\n", df)
Yields below output.
# Output:
# Create DataFrame:
Students Marks
0 Student1 80.4
1 Student1 50.6
2 Student1 70.4
3 Student2 50.2
4 Student2 80.5
5 Student1 70.4
6 Student1 50.4
7 Student1 60.4
8 Student2 90.1
9 Student2 90.5
Plot Distribution of Column Values in Pandas
Using the plot() method, we can plot the distribution of a specific column's values. Set the kind parameter to 'kde' (kernel density estimation) when calling plot(), and it draws the distribution of the column values as a smooth curve. Note that kind='kde' requires SciPy to be installed.
# Plot distribution of values in Marks column
df['Marks'].plot(kind='kde')
print(df)
Yields below output.
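To see what the smooth curve represents, the density can also be computed directly. The following is a minimal sketch, assuming SciPy is installed and using scipy.stats.gaussian_kde, which the Pandas documentation points to for the kind='kde' bandwidth options. The grid range of 0 to 140 is an arbitrary choice for this example, not something Pandas uses.

```python
import numpy as np
import pandas as pd
from scipy.stats import gaussian_kde

# Same Marks values as in the article's DataFrame
marks = pd.Series(
    [80.4, 50.6, 70.4, 50.2, 80.5, 70.4, 50.4, 60.4, 90.1, 90.5],
    name="Marks",
)

# Fit a Gaussian KDE with SciPy's default bandwidth rule
kde = gaussian_kde(marks.to_numpy())

# Evaluate the estimated density on an evenly spaced grid
# (assumption: 0 to 140 is wide enough to cover both tails)
grid = np.linspace(0, 140, 1401)
density = kde(grid)

# A density curve should enclose an area of about 1 (simple Riemann sum)
step = grid[1] - grid[0]
area = density.sum() * step

print(f"Highest density near Marks = {grid[density.argmax()]:.1f}")
print(f"Area under the curve = {area:.3f}")
```

The height of the curve is a density, not a count, so it is the area under the curve over a range of marks that corresponds to a proportion of the data.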
Plot Distribution of Columns in Pandas using Histogram
A histogram is one of the plots Pandas supports, and it represents the frequency distribution of numeric data. It divides the range of a numeric variable into bins and counts the values that fall into each bin. Plotting a histogram is a good way to explore the distribution of our data. When several Series are drawn in one histogram, the result is easiest to read if they are on a similar scale.
Pass kind='hist' into the plot() function to draw the column values of the given DataFrame as a histogram. This plot uses bars to represent the distribution of values in the 'Marks' column.
# Plot distribution of values in Marks column using histogram
df['Marks'].plot(kind='hist', edgecolor='black')
print(df)
Yields below output.
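The binning behind a histogram can be checked numerically. The sketch below is an illustration, not part of the original example: it uses numpy.histogram to count the values per bin and then draws the same bins with plot(). The choice of bins=4 and the Agg backend line are assumptions made so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Students": ["Student1", "Student1", "Student1", "Student2", "Student2",
                 "Student1", "Student1", "Student1", "Student2", "Student2"],
    "Marks": [80.4, 50.6, 70.4, 50.2, 80.5, 70.4, 50.4, 60.4, 90.1, 90.5],
})

# Count how many Marks values fall into each of 4 equal-width bins
counts, edges = np.histogram(df["Marks"], bins=4)
for left, right, count in zip(edges[:-1], edges[1:], counts):
    print(f"{left:6.2f} to {right:6.2f}: {count} value(s)")

# Draw the histogram with the same number of bins
fig, ax = plt.subplots()
df["Marks"].plot(kind="hist", bins=4, edgecolor="black", ax=ax)
ax.set_xlabel("Marks")
```

Each printed row corresponds to one bar in the plot. Call plt.show() or fig.savefig() to view the figure.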
Plot Distribution of Column Values Grouped by Another Column
Using the plot() and groupby() functions together, we can plot the distribution of one column's values grouped by the values of another column. The following syntax plots the distribution of values in the 'Marks' column, grouped by the 'Students' column. We can add a legend with a title using the plt.legend() function and label the x-axis using the plt.xlabel() function. Both functions are provided by the matplotlib library.
import matplotlib.pyplot as plt
# Plot distribution of Marks by Students
df.groupby('Students')['Marks'].plot(kind='kde')
print(df)
# Add legend to plot (labels follow the sorted group order)
plt.legend(['Student1', 'Student2'], title='Students')
# Add x-axis label
plt.xlabel('Marks')
Yields below output.
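Hard-coded legend labels can drift out of sync with the data. As an alternative sketch (not the article's original approach), each group can be plotted in a loop onto a shared Axes, passing the group name as the label so the legend is built from the data itself. SciPy is assumed to be installed, and the Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Students": ["Student1", "Student1", "Student1", "Student2", "Student2",
                 "Student1", "Student1", "Student1", "Student2", "Student2"],
    "Marks": [80.4, 50.6, 70.4, 50.2, 80.5, 70.4, 50.4, 60.4, 90.1, 90.5],
})

fig, ax = plt.subplots()

# Draw one KDE curve per student on the same Axes, labelled by group name
for name, group in df.groupby("Students"):
    group["Marks"].plot(kind="kde", ax=ax, label=name)

ax.legend(title="Students")
ax.set_xlabel("Marks")

# Collect the legend entries to confirm they match the groups
labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(labels)
```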
Plot Distribution of Column Values Grouped by Another Column using Histogram
The following syntax plots the distribution of values in the 'Marks' column, grouped by the 'Students' column, in the form of a histogram.
# Plot distribution of Marks by Students using histogram
df.groupby('Students')['Marks'].plot(kind='hist')
print(df)
# Add legend to plot (labels follow the sorted group order)
plt.legend(['Student1', 'Student2'], title='Students')
# Add x-axis label
plt.xlabel('Marks')
Yields below output.
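When grouped histograms are drawn one after another, each group gets its own bin edges and the bars can hide each other. One possible variation, sketched here as an assumption rather than the article's method, is to reshape the data with pivot() so each student becomes a column, then plot the wide DataFrame once with a transparency setting. The values bins=5 and alpha=0.5 are illustrative choices.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Students": ["Student1", "Student1", "Student1", "Student2", "Student2",
                 "Student1", "Student1", "Student1", "Student2", "Student2"],
    "Marks": [80.4, 50.6, 70.4, 50.2, 80.5, 70.4, 50.4, 60.4, 90.1, 90.5],
})

# One column per student; rows belonging to the other student become NaN
wide = df.pivot(columns="Students", values="Marks")

fig, ax = plt.subplots()

# Plot both columns in a single call; alpha keeps overlapping bars visible
wide.plot(kind="hist", bins=5, alpha=0.5, edgecolor="black", ax=ax)
ax.set_xlabel("Marks")

# The legend entries come from the column names
labels = [text.get_text() for text in ax.get_legend().get_texts()]
print(labels)
```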
FAQ on Plot Distribution of Column Values in Pandas
You can use the .plot() function in Pandas along with kind='hist' to create a histogram, which visualizes the distribution of a numeric column.
The bins parameter controls how many bins are used to divide the range of the data. Adjust it to get a more granular or broader view of the distribution.
You can plot the distribution of multiple columns at once by selecting those columns and calling .plot(kind='hist').
You can use kind='kde' to plot the density estimation, which is a smooth curve representing the data distribution.
You can customize your plots using various Matplotlib features such as changing colors, adding titles, labels, or even adjusting the plot style.
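The customization options mentioned above can be combined in one call. The following is a small sketch under assumed styling choices (the title text, colors, and axis labels are illustrative, and the Agg backend line is only there so it runs without a display).

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend; omit in a notebook
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "Marks": [80.4, 50.6, 70.4, 50.2, 80.5, 70.4, 50.4, 60.4, 90.1, 90.5],
})

fig, ax = plt.subplots()

# Histogram with a title, fill color, and bar outlines set through plot()
df["Marks"].plot(
    kind="hist",
    bins=5,
    color="skyblue",
    edgecolor="black",
    title="Distribution of Marks",
    ax=ax,
)

# Axis labels are set on the Matplotlib Axes
ax.set_xlabel("Marks")
ax.set_ylabel("Number of records")
```

Because plot() draws onto a regular Matplotlib Axes, any further Matplotlib styling can be applied to ax afterwards.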
Conclusion
In this article, I have explained the Pandas DataFrame plot() method and how to use it to visualize the distribution of a DataFrame's column values as KDE and histogram plots, both for a single column and grouped by another column.