  • Post category:Pandas
  • Post last modified:March 27, 2024
Pandas Series.clip() Function

In Pandas, the clip() function is used to limit the values in a Series within a specified range. It’s particularly useful when you want to cap or floor the values of a Series to certain minimum and maximum values.

In this article, I will explain the Series.clip() function, covering its syntax, parameters, and usage. By default it returns a new Series with values clipped to the specified range; if inplace=True is specified, it modifies the existing Series in place and returns None.

Key Points –

  • The clip() function in Pandas is used to limit the values in a Series within a specified range.
  • It provides flexibility in handling outliers or extreme values by enabling you to cap or floor them without manually iterating over the Series.
  • Values below the lower bound are replaced with the lower bound, and values above the upper bound are replaced with the upper bound.
  • The inplace parameter can be used to operate in place, modifying the original Series, if set to True.
  • If inplace is not specified or set to False, the function returns a new Series with clipped values.
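To make the capping and flooring rule concrete, here is a minimal sketch that compares clip() with an explicit Python loop. The sample values and the bounds 0 and 30 are illustrative assumptions, not taken from a real dataset.

```python
import pandas as pd

values = [-10, 20, 30, 40, 50]
lower, upper = 0, 30  # illustrative bounds (assumed for this sketch)

# Manual approach: raise each value to the lower bound, then cap it at the upper bound
manual = [min(max(v, lower), upper) for v in values]

# Same rule applied by clip(), with no explicit loop
clipped = pd.Series(values).clip(lower=lower, upper=upper)

print(manual)            # [0, 20, 30, 30, 30]
print(clipped.tolist())  # [0, 20, 30, 30, 30]
```

Both approaches give the same values; clip() simply expresses the rule in a single vectorized call.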

Series clip() Introduction

Following is the syntax of the pandas Series.clip() function.


# Syntax of Series.clip() function
# (pandas 2.x: axis and inplace are keyword-only; older 1.x releases also accepted *args)
Series.clip(lower=None, upper=None, *, axis=None, inplace=False, **kwargs)

Parameters of the Series.clip()

Following are the parameters of the Series.clip() function.

  • lower – Scalar or array-like, optional. This parameter specifies the lower bound for the values in the Series. If a value in the Series is less than this lower bound, it will be replaced by the lower bound. If set to None, clipping is not performed on the lower end.
  • upper – Scalar or array-like, optional. This parameter specifies the upper bound for the values in the Series. If a value in the Series is greater than this upper bound, it will be replaced by the upper bound. If set to None, clipping is not performed on the upper end.
  • axis – {0 or 'index', None}, default None. It controls how the object is aligned with lower and upper when they are array-like. A Series has only one axis, so this parameter has no practical effect here; axis=1 or axis='columns' is meaningful only for DataFrame.clip().
  • inplace – This is a boolean parameter. If set to True, the operation is performed in place, modifying the original Series. If set to False (default), a new Series with clipped values is returned.
  • **kwargs (and *args in older pandas versions) – Additional arguments accepted for compatibility with NumPy. They have no effect on the result.
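Because lower and upper can be array-like as well as scalar, each element can get its own bounds. The sketch below assumes float data and two hypothetical threshold Series (lower_bounds and upper_bounds) that share the index of the data.

```python
import pandas as pd

s = pd.Series([1.0, 5.0, 10.0])

# Hypothetical per-element thresholds, aligned with s by index
lower_bounds = pd.Series([2.0, 2.0, 2.0])
upper_bounds = pd.Series([4.0, 6.0, 8.0])

# Each value is clipped against the bounds at the same index position
per_element = s.clip(lower=lower_bounds, upper=upper_bounds)
print(per_element.tolist())  # [2.0, 5.0, 8.0]
```

Here 1.0 is raised to its lower bound 2.0, 5.0 already lies within [2.0, 6.0], and 10.0 is capped at its upper bound 8.0.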

Return Value

It returns a new Series object with the values clipped according to the specified lower and upper bounds. If the inplace parameter is set to True, the function will modify the original Series in place and return None.
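The difference between the two return behaviours can be checked with a short sketch. The variable names original, result, and returned are chosen for illustration only.

```python
import pandas as pd

original = pd.Series([-10, 20, 50])

# Default (inplace=False): a new Series is returned and the original is untouched
result = original.clip(lower=0, upper=30)
print(result.tolist())    # [0, 20, 30]
print(original.tolist())  # [-10, 20, 50]

# inplace=True: the original is modified and the call returns None
returned = original.clip(lower=0, upper=30, inplace=True)
print(returned)           # None
print(original.tolist())  # [0, 20, 30]
```

Because the in-place call returns None, assigning its result back to the same variable (for example, series = series.clip(..., inplace=True)) would discard the data.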

Clipping Values Below a Lower Bound

Clipping values below a lower bound means setting any values in the series that are less than the specified lower bound to be equal to that lower bound.

Let’s create a Series from a Python list.


import pandas as pd

# Create a sample Series
series = pd.Series([-10, 20, 30, 40, 50])
print("Original Series:\n", series)

This yields the following output.

# Output:
# Original Series:
# 0   -10
# 1    20
# 2    30
# 3    40
# 4    50
# dtype: int64

To clip values below a lower bound, pass the lower parameter to clip(). Any value smaller than lower is set to lower.


# Clip values below a lower bound
clipped_series = series.clip(lower=0)
print("Clipped Series (values below 0 clipped to 0):\n", clipped_series)

In the above example, the clip() function sets any value in series that is less than 0 to 0. As a result, the negative value -10 is replaced by 0, and all other values remain unchanged. This example yields the following output.

# Output:
# Clipped Series (values below 0 clipped to 0):
# 0     0
# 1    20
# 2    30
# 3    40
# 4    50
# dtype: int64

Clipping Values Above an Upper Bound

Clipping values above an upper bound means setting any values in the Series that are greater than the specified upper bound to be equal to the upper bound.


# Clip values above an upper bound of 30
clipped_series = series.clip(upper=30)
print("Clipped Series (values above 30 clipped to 30):\n", clipped_series)

# Output:
# Clipped Series (values above 30 clipped to 30):
# 0   -10
# 1    20
# 2    30
# 3    30
# 4    30
# dtype: int64

In the above example, the values in the Series that are greater than 30 (40 and 50) have been clipped to 30, which is the specified upper bound.

Clipping Values with Different Lower and Upper Bounds

To restrict values to a range, specify both the lower and upper parameters of clip(). Values below lower are raised to lower, and values above upper are reduced to upper.


# Clip values within the range of 10 to 40
clipped_series = series.clip(lower=10, upper=40)
print("Clipped Series (values clipped to the range of 10 to 40):\n", clipped_series)

# Output:
# Clipped Series (values clipped to the range of 10 to 40):
# 0    10
# 1    20
# 2    30
# 3    40
# 4    40
# dtype: int64

In this example, any values below 10 in the original series have been clipped to 10, and any values above 40 have been clipped to 40 in the clipped series.

Clipping Values Inplace

You can clip in place by setting the inplace parameter to True. In the example below, inplace=True applies the clipping directly to the original Series named series, so its values are modified to fall within the range of 0 to 30 and the call returns None.


# Clip values to be within the range of 0 to 30 inplace
series.clip(lower=0, upper=30, inplace=True)
print("Clipped Series (values clipped to the range of 0 to 30):\n", series)

# Output:
# Clipped Series (values clipped to the range of 0 to 30):
# 0     0
# 1    20
# 2    30
# 3    30
# 4    30
# dtype: int64

Clipping Values with NaNs

When clipping values in a Pandas Series that contains NaNs (missing values), the NaNs are preserved in the resulting Series. Here’s how you can clip values with NaNs using the clip() function.


import pandas as pd
import numpy as np

# Create a sample Series with NaNs
series = pd.Series([-10, 20, np.nan, 40, 50])

# Clip values to be within the range of 0 to 30, NaNs will remain unchanged
clipped_series = series.clip(lower=0, upper=30)
print("Clipped Series (values clipped to the range of 0 to 30):\n", clipped_series)

# Output:
# Clipped Series (values clipped to the range of 0 to 30):
# 0     0.0
# 1    20.0
# 2     NaN
# 3    30.0
# 4    30.0
# dtype: float64

In the above example, the NaN value in the original series remains unchanged after clipping. Any values below 0 are clipped to 0, and any values above 30 are clipped to 30 in the resulting series.

Clipping with NaN Replacement

Similarly, you can clip the values of a Series and then replace its NaNs (missing values) with a specified value by chaining the clip() and fillna() functions.


# Clip values to be within the range of 0 to 30 
# And replace NaNs with a value of -1
clipped_series = series.clip(lower=0, upper=30).fillna(-1)
print("Clipped Series (values clipped to the range of 0 to 30 with NaNs replaced):\n", clipped_series)

# Output:
# Clipped Series (values clipped to the range of 0 to 30 with NaNs replaced):
# 0     0.0
# 1    20.0
# 2    -1.0
# 3    30.0
# 4    30.0
# dtype: float64

In the above example, the clip() function is used to clip values to be within the range of 0 to 30. Then, the fillna() function is used to replace any NaNs with the specified value of -1. Finally, the clipped Series with NaNs replaced is printed.
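The order of the two calls matters when the fill value lies outside the clipping range. The following sketch reuses the same sample data and fill value of -1; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

series = pd.Series([-10, 20, np.nan, 40, 50])

# Clip first, then fill: the fill value -1 is kept as-is
clip_then_fill = series.clip(lower=0, upper=30).fillna(-1)
print(clip_then_fill.tolist())  # [0.0, 20.0, -1.0, 30.0, 30.0]

# Fill first, then clip: the filled -1 is itself clipped up to 0
fill_then_clip = series.fillna(-1).clip(lower=0, upper=30)
print(fill_then_clip.tolist())  # [0.0, 20.0, 0.0, 30.0, 30.0]
```

Use the clip-then-fill order when the fill value is a sentinel that should stay recognizable after clipping.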

Frequently Asked Questions on Pandas Series.clip() Function

What is the purpose of the clip() function in Pandas Series?

The clip() function in Pandas Series serves the purpose of limiting or “clipping” the values within a specified range. It allows you to cap or floor the values of a Series to certain minimum and maximum values. This function is particularly useful for data preprocessing tasks, handling outliers, or ensuring that values remain within a specific range for analysis or visualization purposes.

How does the clip() function handle values outside the specified range?

Values outside the specified range are replaced with the nearest bound. For example, if a value is below the lower bound, it is replaced with the lower bound; if it’s above the upper bound, it is replaced with the upper bound.

Can I use the clip() function to handle outliers in my data?

You can use the clip() function in Pandas to handle outliers in your data. Outliers are data points that significantly differ from the rest of the dataset and can skew statistical analysis or machine learning models. By capping or flooring the values of a Series using the clip() function, you can effectively mitigate the impact of outliers on your analysis.
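One common way to choose the bounds for outlier handling is to derive them from quantiles of the data. The sketch below is only an illustration: the sample values and the 25th and 75th percentile cut-offs are assumptions made here, and suitable cut-offs depend on your dataset.

```python
import pandas as pd

# Sample data with one extreme value (100.0)
data = pd.Series([1.0, 2.0, 3.0, 4.0, 100.0])

# Derive the bounds from the data itself (assumed cut-offs for this sketch)
low = data.quantile(0.25)   # 2.0
high = data.quantile(0.75)  # 4.0

# Cap values outside the quantile range
capped = data.clip(lower=low, upper=high)
print(capped.tolist())  # [2.0, 2.0, 3.0, 4.0, 4.0]
```

The extreme value 100.0 is pulled down to 4.0 instead of being dropped, so the Series keeps its original length.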

How does the inplace parameter work in the clip() function?

The inplace parameter, when set to True, modifies the original Series in place and returns None. If set to False (default), it returns a new Series with clipped values.

Is it possible to clip values only above or below a certain threshold?

It is possible to clip values only above or below a certain threshold by setting either the lower or upper parameter in the clip() function.

How does the clip() function handle NaNs in the Series?

NaNs are preserved in the resulting Series when using the clip() function. They are not affected by the clipping operation unless explicitly handled through methods like fillna().

Conclusion

In this article, I have explained the clip() function in Pandas, which limits the values in a Series to a specified range. It ensures that all non-missing values fall within the provided lower and upper bounds.

Happy Learning!!

Malli

Malli is an experienced technical writer with a passion for translating complex Python concepts into clear, concise, and user-friendly articles. Over the years, he has written hundreds of articles on Pandas, NumPy, and Python, and takes pride in his ability to bridge the gap between technical experts and end-users.