The pandas isin() function is available on both DataFrame and Series and checks whether each element of the object is contained in a given collection of values (a list, Series, dict, or another DataFrame). It returns an object of the same type and shape as the caller, filled with booleans indicating whether each element is in values.
When called on a DataFrame, it returns a DataFrame of booleans showing whether each cell matches one of the values. When called on a Series, it returns a Series of booleans indicating whether each element is in values.
pandas isin() Key Points –
- isin() function exists in both DataFrame and Series.
- It returns an object of the same type and shape as the caller, holding boolean values.
- Each element is True when its value is present in values, otherwise False.
- The boolean Series returned by Series.isin() can be used to filter the rows of a DataFrame.
1. pandas isin() Syntax
The syntax of the isin() function is shown below. It takes a single values argument, which can be an iterable, a Series, a DataFrame, or a dict.
# Syntax of isin() function
isin(values)
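The values parameter accepts the following.
values: iterable, Series, DataFrame, or dict for DataFrame.isin(); a set or list-like object for Series.isin(). A single string is not accepted and raises a TypeError.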
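As a small illustration of what values accepts, the sketch below passes a list, a set, and a Series to Series.isin(), and then shows that a bare string is rejected. The Series name courses is made up for this example.

```python
import pandas as pd

# Illustrative Series (name chosen for this sketch only)
courses = pd.Series(['Spark', 'Python', 'Java'])

# A list, a set, and another Series are all accepted as values
from_list = courses.isin(['Spark', 'Java'])
from_set = courses.isin({'Spark', 'Java'})
from_series = courses.isin(pd.Series(['Spark', 'Java']))

# A bare string is not list-like, so pandas raises TypeError
try:
    courses.isin('Spark')
    raised = False
except TypeError:
    raised = True

print(from_list.tolist(), raised)
```

To test against one value, wrap it in a list, for example isin(['Spark']).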
Let’s create a DataFrame and learn how to use isin() function with examples.
# Create a pandas DataFrame
import pandas as pd
df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})
print("Create DataFrame\n", df)
# Output:
# Create DataFrame
# Courses Fee Duration
# 0 Spark 22000 30days
# 1 Python 25000 50days
# 2 Java 23000 30days
2. Series.isin() Example
The pandas Series.isin() function is commonly used to filter the DataFrame rows whose column value appears in a list of values. When called on a Series, it returns a Series of booleans indicating whether each element is in values: True when present, False when not. You can pass this boolean Series to the DataFrame to filter its rows.
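To make the two steps explicit, the sketch below first builds the boolean Series and only then uses it to index the DataFrame. The names mask and filtered are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})

# Step 1: Series.isin() returns a boolean Series aligned with df's index
mask = df['Courses'].isin(['Spark', 'Java'])
print(mask.tolist())

# Step 2: indexing with the mask keeps only the rows marked True
filtered = df[mask]
print(filtered.index.tolist())
```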
2.1. Using Single Value
The example below checks whether each element of the Courses column equals Spark and marks it True when it does. Indexing the DataFrame with this mask returns the rows that have Spark in the Courses column. Using isin() with a list containing a single value is essentially an equality check against that value.
# Specific value
df2 = df[df['Courses'].isin(['Spark'])]
print("Specific value:\n", df2)
# Output:
# Specific value:
# Courses Fee Duration
# 0 Spark 22000 30days
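The equivalence with a plain equality test can be checked directly. This is a minimal sketch; by_isin and by_eq are names introduced here for the comparison.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})

# Filter with a one-element list passed to isin()
by_isin = df[df['Courses'].isin(['Spark'])]

# Filter with the == operator
by_eq = df[df['Courses'] == 'Spark']

# Both approaches select the same rows
print(by_isin.equals(by_eq))
```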
2.2. isin() with List of Values
To match against several values, pass them to isin() as a list. The call checks whether each value in the Courses column is contained in the list.
In the example below, df2 contains only the rows where the Courses column is either Spark or Java. The isin(['Spark','Java']) condition creates a boolean mask, and only the rows marked True are kept in the filtered DataFrame.
# List of values
df2=df[df['Courses'].isin(['Spark','Java'])]
print("Filtered DataFrame:\n", df2)
# Output:
# Filtered DataFrame:
# Courses Fee Duration
# 0 Spark 22000 30days
# 2 Java 23000 30days
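The same mask can be inverted to keep the rows that are not in the list. The sketch below uses the ~ operator on the boolean Series; not_in is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})

# ~ flips True/False, so this keeps rows whose course is NOT in the list
not_in = df[~df['Courses'].isin(['Spark', 'Java'])]
print(not_in)
```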
3. DataFrame.isin() Example
Below are examples of how to use the DataFrame.isin() function.
3.1. isin() with List of Values
When a Python list is passed to DataFrame.isin(), it checks whether each cell of the DataFrame is present in the list and shows True if it is found and False otherwise. The resulting DataFrame contains only boolean values.
# isin() with list of values
print(df.isin(['Spark','Python',23000,'50days']))
# Output:
# Courses Fee Duration
# 0 True False False
# 1 True False True
# 2 False True False
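A cell-level mask like this is often reduced to one boolean per row before filtering. The sketch below, with illustrative names cell_mask and rows, keeps every row in which at least one cell matches.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})

# Boolean DataFrame with the same shape as df
cell_mask = df.isin(['Spark', 23000])

# any(axis=1) is True for a row when any of its cells matched
rows = df[cell_mask.any(axis=1)]
print(rows)
```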
3.2. Using Dict
The example above checks every column against the same list. To check values in specific columns, pass a Python dict to isin() with column names as keys and the list of values to look for as each key's value. Columns that are not listed in the dict are reported as False. With this, you can check different values in multiple columns.
# check by column name
print(df.isin({'Courses': ['Spark', 'Python']}))
# Output:
# Courses Fee Duration
# 0 True False False
# 1 True False False
# 2 False False False
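To require a match in several columns at once, one option is to pass a dict with more than one key and combine the listed columns with all(axis=1). This is a sketch under that assumption; criteria, cell_mask, and matched are names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})

# One list of allowed values per column
criteria = {'Courses': ['Spark', 'Python'], 'Duration': ['30days']}
cell_mask = df.isin(criteria)

# Only look at the columns named in the dict; the others are always False
row_mask = cell_mask[list(criteria)].all(axis=1)
matched = df[row_mask]
print(matched)
```

Restricting the mask to the dict's keys matters here, because columns missing from the dict (Fee in this case) would otherwise make all(axis=1) False for every row.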
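3.3. Using Another DataFrame
You can also pass another DataFrame. In that case, a cell is True only when the other DataFrame holds the same value at the same index label and column label.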
# Checks in another DataFrame
df2 = pd.DataFrame({
    'Courses': ['C++', 'Python'],
    'Fee': [23000, 25000],
    'Duration': ['30days', '55days']
})
print(df.isin(df2))
# Output:
# Courses Fee Duration
# 0 False False True
# 1 True True False
# 2 False False False
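Because the comparison is label-based, the same values under different index labels no longer match. The sketch below illustrates this; other and shifted are hypothetical names, and shifted is simply a copy of other with index labels 5 and 6.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})
other = pd.DataFrame({
    'Courses': ['C++', 'Python'],
    'Fee': [23000, 25000],
    'Duration': ['30days', '55days']
})

# Index labels 0 and 1 line up with df, so three cells match
aligned = df.isin(other)
print(aligned.to_numpy().sum())

# Same values, but the index labels no longer exist in df
shifted = other.copy()
shifted.index = [5, 6]
print(df.isin(shifted).to_numpy().any())
```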
4. Complete Example of DataFrame & Series isin()
# Create a pandas DataFrame.
import pandas as pd
df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})
print(df)
# List of values
print(df.isin(['Spark', 'Python', 23000, '50days']))
# Check by column
print(df.isin({'Courses': ['Spark', 'Python', 23000]}))
# Using DataFrame
df2 = pd.DataFrame({
    'Courses': ['C++', 'Python'],
    'Fee': [23000, 25000],
    'Duration': ['30days', '55days']
})
print(df.isin(df2))
# Single value
df2 = df[df['Courses'].isin(['Spark'])]
print(df2)
# Multiple values
df2 = df[df['Courses'].isin(['Spark', 'Java'])]
print(df2)
Frequently Asked Questions on Pandas isin()
The isin() function in pandas is used to filter rows of a DataFrame based on whether certain values exist in a particular column. It returns a boolean mask indicating whether each element in the specified column is contained in the provided list of values.
You can use isin() with multiple columns in pandas. In that case, you typically provide a dictionary where the keys are column names and the values are lists of values to check for in each respective column.
You can use isin() with a single value in pandas by wrapping it in a list. It then checks whether each element in the specified column is equal to that single value.
The isin() function in pandas does not modify the original DataFrame. Instead, it returns a new DataFrame or Series containing boolean values that represent whether each element in the specified column(s) is contained in the provided list of values.
The isin() function works with various data types, including strings and numbers. It checks for equality, so it gives the expected result as long as the values you pass have the same type as the values in the column. For example, the string '23000' does not match the integer 23000.
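The effect of mismatched types can be seen on the integer Fee column. This is a minimal sketch; as_int and as_str are names used only here.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})

# Integer value against an integer column: the last row matches
as_int = df['Fee'].isin([23000])

# String value against an integer column: nothing matches
as_str = df['Fee'].isin(['23000'])

print(as_int.tolist(), as_str.tolist())
```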
You can use the isin() function with another DataFrame for filtering by comparing a column in the first DataFrame with a column in the second DataFrame. The isin() function creates a boolean mask based on whether the values in the specified column of the first DataFrame are present in the column of the second DataFrame.
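A sketch of that column-to-column filtering is shown below. The second DataFrame, allowed, is a hypothetical lookup table introduced for this example. Unlike DataFrame.isin() with a DataFrame argument, Series.isin() only looks at the values it is given, not at their index labels.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ['Spark', 'Python', 'Java'],
    'Fee': [22000, 25000, 23000],
    'Duration': ['30days', '50days', '30days']
})

# Hypothetical second DataFrame holding the courses to keep
allowed = pd.DataFrame({'Courses': ['C++', 'Python']})

# Keep the rows of df whose course also appears in allowed['Courses']
filtered = df[df['Courses'].isin(allowed['Courses'])]
print(filtered)
```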
Conclusion
In this article, you have learned that the isin() function exists on both DataFrame and Series and is used to check whether the object's elements are contained in a list, Series, dict, or another DataFrame.