The pandas DataFrame `boxplot()` function is used to make a box plot from the given DataFrame columns. A box plot, also called a box-and-whisker plot, summarizes the range of values in your data set and highlights any outliers in a format that is easier to read than the raw data.

In the box plot, the x-axis shows the columns (or groups) being plotted and the y-axis shows their values. In this article, I will explain how to plot a box plot from a DataFrame. Pandas draws the plot using the Matplotlib library, which also provides its own `boxplot()` function.

## 1. Quick Examples of Creating a Boxplot from a DataFrame

If you are in a hurry, below are some quick examples of how to create a box plot using `boxplot()`.

```
# Quick examples of creating a boxplot from a DataFrame
import matplotlib.pyplot as plot
import numpy as np
import pandas as pd

# Create DataFrame
np.random.seed(10)
df = pd.DataFrame(np.random.rand(10, 3),
                  columns=['Num1', 'Num2', 'Num3'])

# Example 1: Plot the box plot of a single column of the DataFrame
b_plot = df.boxplot(column='Num1')
plot.show()

# Example 2: Create box plots for multiple columns
b_plot = df.boxplot(column=['Num1', 'Num2', 'Num3'])
plot.show()

# Example 3: Customize the boxplot color
b_plot = df.boxplot(column='Num1', color='orange')
plot.show()

# Example 4: Add a title to the boxplot
b_plot = df.boxplot(column='Num1')
plot.title('Random Numbers')
plot.show()

# Example 5: Customize the font size of the tick labels
b_plot = df.boxplot(column='Num1', fontsize=15)
plot.show()
```

## 2. Syntax of Pandas boxplot()

Following is the syntax of the `boxplot()` function.

```
# Syntax of boxplot()
DataFrame.boxplot(column=None, by=None, ax=None, fontsize=None, rot=0, grid=True, figsize=None, layout=None, return_type=None, backend=None, **kwargs)
```

### 2.1 Parameters of the boxplot()

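Following are the parameters of `boxplot()`.

- `column`: (string or list of strings) Column name or names to plot.
- `by`: (string or array-like) Column in the DataFrame to group by.
- `ax`: (object of class `matplotlib.axes.Axes`) The Matplotlib axes to be used by the boxplot.
- `fontsize`: (int, float, or string) The font size of the tick labels.
- `rot`: (int or float) The angle, in degrees, by which the labels should be rotated.
- `grid`: (bool) Whether or not to show the grid.
- `figsize`: (tuple of (width, height)) The size of the figure, in inches.
- `layout`: (tuple of (rows, columns)) The layout of the subplots when several are drawn.
- `return_type`: (`'axes'`, `'dict'`, `'both'`, or `None`) The kind of object to return.
- `backend`: (string) The plotting backend to use instead of the one set in the `plotting.backend` option.
- `**kwargs`: All other plotting keyword arguments to be passed to `matplotlib.pyplot.boxplot()`.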
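As a rough illustration of how several of these parameters combine, the sketch below draws onto an axes object created beforehand and adjusts the label size, label rotation, and grid. The data and the specific values chosen are assumptions made up for this example.

```python
import matplotlib.pyplot as plot
import numpy as np
import pandas as pd

# Illustrative data (assumed for this sketch): 20 rows, 2 numeric columns
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((20, 2)), columns=['Num1', 'Num2'])

# Create the figure and axes ourselves, then hand the axes to boxplot()
fig, ax = plot.subplots(figsize=(6, 4))
result = df.boxplot(
    column=['Num1', 'Num2'],  # columns to draw
    ax=ax,                    # draw on the axes created above
    fontsize=12,              # tick label font size
    rot=45,                   # rotate the x tick labels by 45 degrees
    grid=False,               # hide the background grid
)
# plot.show()  # uncomment to display the figure
```

Because no `by` grouping is used here, the object returned by `boxplot()` is the same axes that was passed in through `ax`.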

### 2.2 Return Value

The return value depends on `return_type`:

- `'axes'`: Returns the Matplotlib axes that the boxplot is drawn on.
- `'dict'`: Returns a dictionary whose values are the Matplotlib lines of the boxplot.
- `'both'`: Returns a named tuple with the axes and the dict.

When grouping with `by`, a Series mapping columns to `return_type` is returned. If `return_type` is `None` in that case, a NumPy array of axes with the same shape as `layout` is returned.
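The following sketch shows what each `return_type` value gives back when no `by` grouping is used. The DataFrame is made up for the example, and a new figure is opened between calls only so the plots do not overlap.

```python
import matplotlib.pyplot as plot
import numpy as np
import pandas as pd

# Illustrative data (assumed for this sketch)
rng = np.random.default_rng(1)
df = pd.DataFrame(rng.random((10, 2)), columns=['Num1', 'Num2'])

# 'axes': the Matplotlib axes the boxplot was drawn on
axes_result = df.boxplot(column='Num1', return_type='axes')

# 'dict': a dictionary of the Matplotlib artists that make up the plot
plot.figure()
dict_result = df.boxplot(column='Num1', return_type='dict')
print(sorted(dict_result.keys()))

# 'both': a named tuple holding the axes and the dictionary
plot.figure()
both_result = df.boxplot(column='Num1', return_type='both')
ax, lines = both_result
```

The dictionary form is handy when you want to restyle individual parts of the plot afterwards, for example the median lines stored under the `'medians'` key.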

## 3. Usage of boxplot()

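A box plot is a popular way to visualize numerical data in pandas. It is built from the quartiles of a data set, which divide the sorted values into four parts containing roughly equal numbers of observations. The plot is based on the following values.

- `Median`: The value in the middle of the distribution (the 50th percentile).
- `Lower quartile`: The value below which 25% of the data falls (the 25th percentile).
- `Upper quartile`: The value below which 75% of the data falls (the 75th percentile).
- `Lower boundary`: The end of the lower whisker. By default, this is the lowest value that lies within 1.5 times the interquartile range below the lower quartile, which is the minimum of the data when there are no low outliers.
- `Higher boundary`: The end of the upper whisker. By default, this is the highest value that lies within 1.5 times the interquartile range above the upper quartile, which is the maximum of the data when there are no high outliers.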
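To make these values concrete, here is a minimal sketch that computes them by hand for a small sample. The sample is made up, and the 1.5 multiplier mirrors Matplotlib's default `whis=1.5` setting.

```python
import pandas as pd

# Small hand-made sample (assumed for illustration)
values = pd.Series([1, 2, 3, 4, 5, 6, 7, 8, 9])

q1 = values.quantile(0.25)      # lower quartile
median = values.quantile(0.50)  # median
q3 = values.quantile(0.75)      # upper quartile
iqr = q3 - q1                   # interquartile range

# Fences at 1.5 * IQR beyond the quartiles
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr

# Whiskers end at the most extreme data points that lie inside the fences
lower_whisker = values[values >= lower_fence].min()
upper_whisker = values[values <= upper_fence].max()

print(q1, median, q3)                # 3.0 5.0 7.0
print(lower_whisker, upper_whisker)  # 1 9
```

Since this sample has no values outside the fences, the whiskers reach the minimum and maximum of the data.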

## 4. Pandas Boxplot Single Column

We can visualize a DataFrame as a box plot by using the `boxplot()` function, which summarizes the given data in the form of a boxplot. Let’s create a Pandas DataFrame with columns of randomly generated numbers using the `np.random.rand()` function. To get the same random numbers on every run, we first set the random seed with the `seed()` function.

```
# Imports
import matplotlib.pyplot as plot
import pandas as pd
import numpy as np
# Create DataFrame
np.random.seed(10)
df = pd.DataFrame(np.random.rand(10, 3),
                  columns=['Num1', 'Num2', 'Num3'])
print(df)
```

This prints a DataFrame with 10 rows and three columns (`Num1`, `Num2`, `Num3`) of random values between 0 and 1.

Using the above DataFrame, let’s plot a boxplot of the random numbers. In the boxplot, the bottom line marks the minimum value and the top line marks the maximum value, excluding any outliers. Between the bottom and the top, the three lines of the box indicate the 1st quartile, the median, and the 3rd quartile, respectively.

Let’s create a boxplot for a single column of a given DataFrame using the `boxplot()` function. It will generate a boxplot from the `'Num1'` column.

```
# Plot the box plot of a single column of the DataFrame
b_plot = df.boxplot(column='Num1')
plot.show()
```

This displays a single box plot for the `Num1` column.

## 5. Pandas Boxplot Multiple Columns

Let’s call `boxplot()` with multiple column names; it creates a boxplot for each column. Here it will generate boxplots from the columns `'Num1'`, `'Num2'`, and `'Num3'`. Boxplots are not limited to depicting single columns. A major use case for boxplots is to compare related distributions. For example,

```
# Create box plots for multiple columns
b_plot = df.boxplot(column=['Num1', 'Num2', 'Num3'])
plot.show()
```

This displays three box plots side by side, one per column.

From the above, you can see the distribution of the random numbers in each column and how the columns compare with one another. You can also notice an outlier in the `Num2` distribution, denoted by the circle outside the whiskers.
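If you want to check for outliers numerically rather than by eye, one option is to apply the same 1.5 × IQR rule that the plot uses by default. The sketch below does this on a small made-up DataFrame (not the random data above), in which `Num2` contains one deliberately extreme value.

```python
import pandas as pd

# Hand-made data (assumed for illustration); Num2 contains one extreme value
df = pd.DataFrame({
    'Num1': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    'Num2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 50],
})

# Per-column quartiles and interquartile range
q1 = df.quantile(0.25)
q3 = df.quantile(0.75)
iqr = q3 - q1

# A value is flagged when it falls more than 1.5 * IQR outside the quartiles
is_outlier = (df < q1 - 1.5 * iqr) | (df > q3 + 1.5 * iqr)

print(is_outlier.sum())         # number of flagged values per column
print(df[is_outlier['Num2']])   # rows where Num2 is flagged
```

Here only the value 50 in `Num2` is flagged, which is the point that a boxplot of this data would draw beyond the upper whisker.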

## 6. Pandas Boxplot Customizations

The pandas library provides multiple keyword arguments for customizing boxplots. Let’s see some of them and how they work with boxplots.

### 6.1 Customize the Color of Boxplot

We can make the boxplot easier to read by giving it a custom color. To do that, pass the `color` argument to `boxplot()`, which draws the boxplot in the given color.

```
# Customize the boxplot color
b_plot = df.boxplot(column='Num1', color='orange')
plot.show()
```

This displays the `Num1` box plot drawn in orange.

### 6.2 Pandas Boxplot Title

A title helps users quickly understand what they are seeing. You can add a title to your boxplot by using the Matplotlib `title()` function.

```
# Add a title to the boxplot
b_plot = df.boxplot(column='Num1')
plot.title('Random Numbers')
plot.show()
```

This displays the box plot with the title `Random Numbers` above it.
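An alternative, sketched below, is to set the title on the axes object that `boxplot()` returns instead of going through the `pyplot` module. This can be easier to follow when a script has several figures open. The y-axis label used here is a hypothetical one chosen for the example.

```python
import matplotlib.pyplot as plot
import numpy as np
import pandas as pd

# Same DataFrame as in the examples above
np.random.seed(10)
df = pd.DataFrame(np.random.rand(10, 3), columns=['Num1', 'Num2', 'Num3'])

# boxplot() returns the axes it drew on, so labels can be set on it directly
ax = df.boxplot(column='Num1')
ax.set_title('Random Numbers')
ax.set_ylabel('Value')  # hypothetical axis label for this sketch
# plot.show()  # uncomment to display the figure
```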

### 6.3 Pandas Boxplot Label Font Size

We can change the default font size by providing a custom size. This can make the boxplot labels clearer and easier to read. To do that, pass the `fontsize` argument to this function.

```
# Customize the font size of the tick labels
b_plot = df.boxplot(column='Num1', fontsize=15)
plot.show()
```

This displays the box plot with larger tick labels.

## Frequently Asked Questions on Plot the Boxplot from DataFrame

**How do I install the required libraries (Matplotlib and Seaborn)?**

To install the required libraries, Matplotlib and Seaborn, you can run `pip install matplotlib seaborn` in your terminal or command prompt.

**What is a boxplot, and what information does it provide?**

A boxplot (box-and-whisker plot) is a graphical representation that displays the distribution of a dataset. It shows the median, quartiles, and potential outliers. The box represents the interquartile range (IQR), the line inside the box is the median, and the whiskers extend to show the range of the data. Outliers may be shown as individual points beyond the whiskers.

**How can I customize the appearance of the boxplot?**

Both Matplotlib and Seaborn offer a wide range of customization options. You can modify the title, labels, colors, and more. For example, with Matplotlib (imported as `plt`) you can use functions like `plt.title()`, `plt.xlabel()`, and `plt.ylabel()`. Seaborn's `boxplot()` returns a Matplotlib axes object, so you can call `ax.set_title()`, `ax.set_xlabel()`, and `ax.set_ylabel()` on it to customize the plot.

**Can I plot a specific column from the DataFrame as a boxplot?**

You can specify the column(s) you want to include in the boxplot with the `column` parameter. For example, if you have a DataFrame `df` and want to plot only the ‘Category1’ column, call `df.boxplot(column='Category1')`.

**How do I interpret outliers in a boxplot?**

In a boxplot, outliers are individual points beyond the whiskers. They represent values that are significantly different from the majority of the data. Outliers can indicate potential errors in the data or interesting observations that merit further investigation.

**Can I combine boxplots for different categories side by side?**

You can create side-by-side boxplots for different categories by using the `boxplot()` function with the `by` parameter.
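As a minimal sketch of grouping with `by`, the example below uses a hypothetical DataFrame with one numeric column (`Score`) and one categorical column (`Group`). It also asks for `return_type='dict'` so that the median of each group can be read back from the plot.

```python
import matplotlib.pyplot as plot
import pandas as pd

# Hypothetical data: one numeric column and one categorical column
df = pd.DataFrame({
    'Score': [10, 12, 14, 16, 30, 32, 34, 36],
    'Group': ['A', 'A', 'A', 'A', 'B', 'B', 'B', 'B'],
})

# One box per value of 'Group', drawn side by side
result = df.boxplot(column='Score', by='Group', return_type='dict')

# With `by`, the result is a Series indexed by column name;
# each entry holds the Matplotlib artists for that column's boxes
medians = [line.get_ydata()[0] for line in result['Score']['medians']]
print(medians)
# plot.show()  # uncomment to display the figure
```

If you only need the picture, you can leave out `return_type` and simply call `df.boxplot(column='Score', by='Group')`.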

## Conclusion

In this article, I have explained the `boxplot()` function and how to use it to present the data in a DataFrame as a boxplot. I also explained how to customize the boxplot using various keyword arguments.

Happy learning !!

## Related Articles

- How to Change Pandas Plot Size?
- How to add title to Pandas plots?
- How to generate line plot in Pandas?
- How to add legends to plots in Pandas
- How to change Plot size in pandas?
- How to Plot Columns of Pandas DataFrame
- How to generate histograms in Pandas?
- How to create Pandas Series plot?
- How to Plot a Scatter Plot Using Pandas?
- How to Generate Time Series Plot in Pandas?
- Create Pandas Plot Bar Explained with Example