Pandas Series where() Function

Post last modified: March 27, 2024

In Pandas, the Series.where() function keeps each value where a condition is True and replaces it where the condition is False. The replacement can be a scalar value, another Series, or a callable; if you do not specify one, NaN is used. Unlike boolean indexing, where() does not drop elements: the result has the same index and length as the original Series.

In this article, I will explain the Pandas Series where() function, including its syntax, parameters, and usage, and show how to replace values within a Series based on a condition.

Key Points –

  • The Series.where() method conditionally replaces values within a Series: it retains the original values where the condition is True and replaces the values where the condition is False with a specified substitute.
  • You can specify a scalar value, another Series, or a callable function as the replacement, offering a versatile way to customize the substitution based on your data and conditions.
  • By default, where() returns a new Series with the specified replacements, leaving the original Series unchanged. If you set the inplace parameter to True, the original Series is modified in place, and None is returned.
  • A scalar replacement is broadcast to every position where the condition is False. When the condition or the replacement is itself a Series, it is aligned with the original Series by index label, and the operation is performed element-wise.

Series where() Introduction

Following is the syntax of the Series where() function.


# Syntax of Series where(); errors and try_cast exist only in pandas 1.x (removed in pandas 2.0)
Series.where(cond, other=nan, inplace=False, axis=None, level=None, errors='raise', try_cast=False)

Parameters of Series where()

Following are the parameters of the series where() function.

  • cond – The condition to apply: a boolean Series, an array-like of booleans, or a callable that returns one. Values for which the condition is False are replaced with the corresponding values from other.
  • other – The replacement values for elements where the condition is False. By default, it is set to nan (Not a Number).
  • inplace – If True, the operation will modify the Series in place and will return None. If False (default), it will return a new Series with the values modified.
  • axis – Alignment axis, if needed. A Series has only one axis, so this parameter is unused; it is present for compatibility with DataFrame.where().
  • level – Alignment level, if needed, for example when aligning against a MultiIndex.
  • errors – Accepted 'raise' or 'ignore' in pandas 1.x, although the pandas documentation noted that it did not affect the result. It was deprecated and then removed in pandas 2.0.
  • try_cast – In pandas 1.x, tried to cast the result back to the input dtype where possible. It was deprecated and then removed in pandas 2.0.

Return value

By default, it returns a new Series with the same index in which the values that satisfy the condition are kept and the others are replaced. With inplace=True, it modifies the original Series and returns None.
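The difference between the default behavior and inplace=True can be checked with a short sketch. The Series name s and its values are made up for illustration only.

```python
import pandas as pd

s = pd.Series([1, 2, 3, 4])

# Default: a new Series is returned and s is left untouched
new_s = s.where(s > 2, 0)
print(new_s.tolist())   # [0, 0, 3, 4]
print(s.tolist())       # [1, 2, 3, 4]

# inplace=True: s itself is modified and the call returns None
ret = s.where(s > 2, 0, inplace=True)
print(ret)              # None
print(s.tolist())       # [0, 0, 3, 4]
```

Because the in-place call returns None, assigning its result back to the same variable would discard the Series.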

Create Pandas Series

You can create a Pandas Series from a Python list or a dictionary; the example below creates one from a list. To use Pandas, first import it with import pandas as pd.


# Create a Pandas Series
import pandas as pd
import numpy as np

# Build the Series directly from a list
data = pd.Series([1, 5, 10, 15, 20])
print("Original Series:\n", data)

Yields the output below.

# Output:
# Original Series:
#  0     1
# 1     5
# 2    10
# 3    15
# 4    20
# dtype: int64

Using Pandas where() to Replace Series Values with NaN

You can use the Pandas where() function to replace the values in a Series with NaN values where the condition is not satisfied. If the condition is satisfied the values remain unchanged. For example,


# Replace values with NaN using where()
result = data.where(data >= 10, np.nan)
print(result)

Here, the where() method is applied to the data Series and replaces the values where the condition is False with the specified value (np.nan in this case). The condition is data >= 10, so all values less than 10 are replaced with np.nan. Because NaN is a floating-point value, the result has dtype float64 rather than int64.

Yields the output below.

# Output:
# 0     NaN
# 1     NaN
# 2    10.0
# 3    15.0
# 4    20.0
# dtype: float64

As you can see, values less than 10 in the original series are replaced with NaN in the result series.
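Since other defaults to NaN, passing np.nan explicitly is optional. A minimal sketch, reusing the data Series from above, that compares the two calls (the names implicit and explicit are illustrative only):

```python
import numpy as np
import pandas as pd

data = pd.Series([1, 5, 10, 15, 20])

# other defaults to NaN, so these two calls give the same result
implicit = data.where(data >= 10)
explicit = data.where(data >= 10, np.nan)

print(implicit.equals(explicit))  # True
print(implicit.dtype)             # float64
print(implicit.isna().sum())      # 2
```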

Replace Values using where() with a Specific Value

You can also replace values with a specific value instead of NaN. To do so, call where() on the Series and pass the condition together with the replacement value. It replaces the values where the condition is False and retains the original values otherwise.


# Replace values with 100 using where()
result = data.where(data <= 15, 100)
print(result)

# Output:
# 0      1
# 1      5
# 2     10
# 3     15
# 4    100
# dtype: int64

From the above code, values in the data Series that are less than or equal to 15 will remain unchanged, while values greater than 15 will be replaced with 100.

Use where() with Multiple Conditions

You can replace values based on multiple conditions by chaining where() calls, one per condition.


# Replace values based on multiple conditions
result = data.where(data >= 10, 0).where(data <= 20, 100)
print(result)

# Output:
# 0     0
# 1     0
# 2    10
# 3    15
# 4    20
# dtype: int64

In the above example, the first where() keeps the values for which data >= 10 is True and replaces the values less than 10 with 0. The second where() keeps the values for which data <= 20 is True and replaces the values greater than 20 with 100. Values between 10 and 20 (inclusive) satisfy both conditions and remain unchanged. No value in this Series exceeds 20, so 100 does not appear in the output. Keep in mind that each where() call replaces the values where its condition is False, so every condition must describe the values to keep, not the values to replace. The final result is a new Series (result) with values replaced according to the specified conditions.
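Conditions can also be combined into a single boolean mask with & (and), | (or), and ~ (not) before calling where(). The sketch below uses a made-up Series s that includes values above 20, so that every replacement is visible; the names in_range, inside, outside, and banded are illustrative only.

```python
import pandas as pd

s = pd.Series([1, 5, 10, 15, 20, 25, 30])

# Combine two comparisons into one mask; the parentheses are required
in_range = (s >= 10) & (s <= 20)

# Keep values inside the range, replace everything else with 0
inside = s.where(in_range, 0)
print(inside.tolist())   # [0, 0, 10, 15, 20, 0, 0]

# ~ negates the mask: keep values outside the range instead
outside = s.where(~in_range, 0)
print(outside.tolist())  # [1, 5, 0, 0, 0, 25, 30]

# Two different replacements need two where() calls,
# each condition stating which values to keep
banded = s.where(s >= 10, 0).where(s <= 20, 100)
print(banded.tolist())   # [0, 0, 10, 15, 20, 100, 100]
```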

Pandas Series where() with Lambda function

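Alternatively, you can pass a callable, such as a lambda function, as the cond parameter. The callable receives the Series and must return a boolean Series or array. The other parameter accepts a callable as well, although the example below passes a Series (data**2) as the replacement.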


# Replace values using where() and lambda
result = data.where(lambda x: x % 2 == 0, other=data**2)
print(result)

# Output:
# 0      1
# 1     25
# 2     10
# 3    225
# 4     20
# dtype: int64

In the above example, the lambda function is used to check if each value in the data Series is even (x % 2 == 0). If the condition is True, the original value is kept; otherwise, the value is replaced with its square (data**2). As a result, only the odd values are replaced with their squares.
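The other parameter accepts a callable too. In that case pandas calls it with the Series and uses the returned scalar or Series as the replacement. A minimal sketch with the same data as above (the variable names are illustrative only):

```python
import pandas as pd

data = pd.Series([1, 5, 10, 15, 20])

# Both cond and other are callables that receive the Series
result = data.where(lambda x: x % 2 == 0, other=lambda x: x ** 2)
print(result.tolist())  # [1, 25, 10, 225, 20]

# Same result as passing the precomputed Series data**2
same = data.where(data % 2 == 0, data ** 2)
print(result.equals(same))  # True
```

Callables are mostly useful inside method chains, where the intermediate Series has no variable name to refer to.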

Replace Values using Another Series

Similarly, you can replace values in a Pandas Series with another Series using where() function. First, create two Series and then apply the where() function to the first Series and pass another Series as a replacement. It will replace the values of the first Series with the corresponding values of the second Series where the condition becomes False; otherwise, the values of the first Series remain unchanged.


# Create Pandas Series
import pandas as pd
import numpy as np
data = pd.Series([1, 2, 3, 4, 5])
data1 = pd.Series([10, 20, 30, 40, 50])

# Replace values using another series
result = data.where(data < 3, data1)
print(result)

# Output:
# 0     1
# 1     2
# 2    30
# 3    40
# 4    50
# dtype: int64

In the above example, the where() function keeps the values in data where the condition data < 3 is True. Where the condition is False, the corresponding values from data1 are used. As a result, the values 1 and 2 are retained, and the values greater than or equal to 3 are replaced with 30, 40, and 50 from data1.
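Replacement values from another Series are matched by index label, not by position. The following sketch illustrates this with two small made-up Series (s and other) whose labels are in a different order.

```python
import pandas as pd

s = pd.Series([1, 2, 3], index=["a", "b", "c"])

# Same labels as s, but listed in reverse order
other = pd.Series([30, 20, 10], index=["c", "b", "a"])

# Only label "a" fails the condition, so it takes other["a"], which is 10
result = s.where(s > 1, other)
print(result.to_dict())  # {'a': 10, 'b': 2, 'c': 3}
```

If replacement were positional, label "a" would have received 30, the first element of other.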

Frequently Asked Questions on Pandas Series where() Function

What is the purpose of the where() function in Pandas Series?

The where() function is used to replace values in a Series based on a specified condition. It allows for conditional replacement, where values meeting the condition remain unchanged, and others can be replaced with a specified value or another Series.

How does the where() function work in Pandas?

The where() function works by evaluating a condition on each element of the Series. If the condition is True, the original value is retained; otherwise, it can be replaced with a specified value or the corresponding value from another Series.

How can I use the where() function to replace values with NaN?

One common use of the where() function is to replace values with NaN. You can achieve this by specifying np.nan as the replacement value for the values that do not meet the specified condition.

How can I replace values based on multiple conditions using where()?

You can replace values based on multiple conditions using the where() function by combining these conditions using logical operators like & (and), | (or), and ~ (not).

Is it possible to replace values with a function using where()?

Yes. Both the cond and other parameters of where() accept a callable, such as a function or a lambda. The callable is called with the Series: for cond it should return a boolean Series or array, and for other it should return a scalar or a Series of replacement values.

How can I replace values with another Series using where()?

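You can replace values in a Pandas Series with corresponding values from another Series by passing the second Series as the other parameter of where(). Where the condition is False, the value with the same index label is taken from the second Series.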

Conclusion

In this article, I have explained how the where() function in Pandas conditionally replaces values in a Series. Whether you replace values with a constant, with the result of a function, or with values from another Series, where() allows for flexible and efficient data manipulation.

Happy Learning !!

Malli

Malli is an experienced technical writer with a passion for translating complex Python concepts into clear, concise, and user-friendly articles. Over the years, he has written hundreds of articles on Pandas, NumPy, and Python, and takes pride in his ability to bridge the gap between technical experts and end users.