  • Post last modified:March 27, 2024
Pandas Check Column Contains a Value in DataFrame

You can check whether a column of a pandas DataFrame contains a particular value (a string or an integer) or any of a list of values by using the in operator, pandas.Series.isin(), Series.str.contains(), and related methods. In this article, I will explain how to check if a column contains a particular value with examples. The in checks return True when the value is found in the specified column and False otherwise, while isin() and str.contains() return a boolean Series with one result per row.

1. Quick Examples of Checking if a Pandas Column Contains a Particular Value

If you are in a hurry, below are some quick examples of how to check if a pandas DataFrame column contains/exists a particular string value or a list of values.


# Quick examples of checking if a pandas column contains a value

# Check whether the column contains a value, using its unique values
print('Spark' in df['Courses'].unique())

# Check whether the column contains a value, using a set
print('Spark' in set(df['Courses']))

# Using Series.values
print('Spark' in df['Courses'].values)

# Check which rows match any value in a list
# Using pandas.Series.isin()
print(df['Courses'].isin(['Spark','Python']))

# Select rows whose value contains a substring
print(df[df['Courses'].str.contains('ark')])

Now, let’s create a pandas DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.


# Create a DataFrame.
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame:\n", df)

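Yields below output.

# Output:
# Create DataFrame:
#      Courses    Fee Duration  Discount
# r1    Spark  20000   30days      1000
# r2  PySpark  25000   40days      2300
# r3   Python  22000   35days      1200
# r4   pandas  30000   50days      2000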

2. Check Column Contains a Value in DataFrame

To check if a column contains a string value in a pandas DataFrame, apply the in operator to the column's values rather than to the Series itself, because in applied directly to a Series tests the index labels. df['Courses'] returns a Series object with all values from the column Courses, and pandas.Series.unique() returns the unique values of that Series in order of appearance, using a hash-table-based algorithm. The in operator returns True when the value is found among them.

The below example gets the unique values in the ‘Courses’ column using the unique() method. It then checks if the string ‘Spark’ is present in the array of unique values and prints the result.


# Check for the value among the column's unique values.
print('Spark' in df['Courses'].unique())

# Output:
# True

We can use the in and not in operators on other collections of the column's values as well. For instance, the example below first creates a set of the values in the ‘Courses’ column using the set() function. Then, it checks if the string ‘Spark’ is present in that set and prints the result.


# Check for the value in a set built from the column.
print('Spark' in set(df['Courses']))

# Output:
# True

You can also use the in operator with pandas.Series.values, which returns the column's underlying data as a numpy.ndarray. For instance, this program directly checks if the string ‘Spark’ is present in the underlying NumPy array (values) of the ‘Courses’ column. If ‘Spark’ is present in the values, the print statement outputs True; otherwise, it outputs False.


# Check for the value in the array returned by Series.values.
print('Spark' in df['Courses'].values)

# Output:
# True
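
One pitfall is worth illustrating here: applying in directly to a Series tests the index labels, not the values. The minimal sketch below rebuilds the article's ‘Courses’ column as a standalone Series (the variable names are only for illustration) and compares the two behaviours, along with an alternative based on an element-wise comparison followed by any().

```python
import pandas as pd

# Rebuild the 'Courses' column from the article as a standalone Series.
courses = pd.Series(["Spark", "PySpark", "Python", "pandas"],
                    index=["r1", "r2", "r3", "r4"], name="Courses")

# `in` on the Series itself looks at the index labels.
in_index = "r1" in courses        # 'r1' is an index label
in_series = "Spark" in courses    # 'Spark' is a value, not a label

# Checking the values instead.
in_values = "Spark" in courses.values
by_comparison = bool((courses == "Spark").any())

print(in_index, in_series, in_values, by_comparison)
```

This is why the examples in this section go through unique(), set(), or values before using in.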

3. Using pandas.Series.isin() to Check Column Contains Value

The pandas.Series.isin() function checks whether each element of a column matches any value in a list. It returns a boolean Series showing whether each element in the Series exactly matches an element in the passed sequence of values.

In the below example, contains_spark will be a boolean Series where each element indicates whether the corresponding value in the ‘Courses’ column is equal to ‘Spark’ or ‘Python’.


# Check which rows of the column match a value in the list, using pandas.Series.isin()
contains_spark = df['Courses'].isin(['Spark','Python'])
print(contains_spark)

# Output:
# r1     True
# r2    False
# r3     True
# r4    False
# Name: Courses, dtype: bool
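
Because isin() returns one boolean per row, it often needs a further step to answer the question "does the column contain this value?". The sketch below, using a trimmed copy of the article's DataFrame and illustrative variable names, reduces the mask with any(), uses it to select the matching rows, and applies the same call to the numeric ‘Fee’ column.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
}, index=["r1", "r2", "r3", "r4"])

mask = df["Courses"].isin(["Spark", "Python"])

# Is at least one of the listed values present in the column?
any_present = bool(mask.any())

# Rows whose course is one of the listed values.
matching_rows = df[mask]

# isin() works the same way on a numeric column.
fee_found = bool(df["Fee"].isin([25000]).any())

print(any_present)
print(matching_rows)
print(fee_found)
```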

4. Series.str.contains() to Check Part of a Value in Column

You can determine whether the values of a pandas column contain a particular substring using Series.str.contains(). This function tests whether a pattern or regex is contained within each string of a Series or Index, and returns a boolean Series that can be used to filter the DataFrame.


# Select rows whose 'Courses' value contains the substring 'ark'.
print(df[df['Courses'].str.contains('ark')])

# Output:
#     Courses    Fee Duration  Discount
# r1    Spark  20000   30days      1000
# r2  PySpark  25000   40days      2300
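
Since str.contains() treats its pattern as a regular expression by default, literal text containing regex metacharacters and columns with missing values need a little care. The following sketch uses a hypothetical Series (the values ‘C++’ and None are not part of the article's DataFrame) to show the regex and na parameters.

```python
import pandas as pd

# Hypothetical column with a regex metacharacter and a missing value.
courses = pd.Series(["Spark", "C++", None, "PySpark"])

# regex=False treats the pattern as a literal substring;
# na=False makes missing entries count as "no match".
literal_mask = courses.str.contains("C++", regex=False, na=False)

# A regular expression: values ending in 'ark'.
regex_mask = courses.str.contains(r"ark$", na=False)

print(literal_mask.tolist())
print(regex_mask.tolist())
```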

5. Complete Example of Checking if a Column Contains a Particular Value


# Create a DataFrame.
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Check for the value among the column's unique values.
print('Spark' in df['Courses'].unique())

# Check for the value in a set built from the column.
print('Spark' in set(df['Courses']))

# Check for the value in the array returned by Series.values.
print('Spark' in df['Courses'].values)

# Check which rows match a value in the list, using pandas.Series.isin()
print(df['Courses'].isin(['Spark','Python']))

# Select rows whose 'Courses' value contains the substring 'ark'.
print(df[df['Courses'].str.contains('ark')])

Frequently Asked Questions

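How do I check if a specific value exists in a column of a Pandas DataFrame?

To check if a specific value exists in a column of a Pandas DataFrame, you can use the isin() method and reduce its result with any(), for example df['Courses'].isin(['Spark']).any(). The in operator applied to df['Courses'].values works as well.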

Can I check if multiple values exist in a column?

You can check if multiple values exist in a column using the isin() method in pandas. For example, with contains_values = df['Courses'].isin(['Python', 'Java']), contains_values will be a boolean Series where each element indicates whether the corresponding value in the ‘Courses’ column is equal to either ‘Python’ or ‘Java’.
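
As a rough sketch of this answer, the snippet below builds contains_values on a one-column copy of the article's data and, as an extra step not described above, uses a set intersection to report which of the requested values actually occur. The names requested and found are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})
requested = ["Python", "Java"]

# True where the course is exactly 'Python' or 'Java'.
contains_values = df["Courses"].isin(requested)

# Which of the requested values actually occur in the column?
found = set(requested) & set(df["Courses"])

print(contains_values.tolist())
print(found)
```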

What if I want to check if a column contains a substring?

If you want to check if a column contains a substring, you can use the str.contains() method in pandas. For example, with contains_substring = df['Courses'].str.contains('Spark'), contains_substring will be a boolean Series where each element indicates whether the substring ‘Spark’ is present in the corresponding value of the ‘Courses’ column.
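
A minimal sketch of the substring check, again on a one-column copy of the article's data; the has_substring name is an illustrative addition that collapses the per-row result into a single answer.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

# True where the course name contains 'Spark' anywhere in the string.
contains_substring = df["Courses"].str.contains("Spark")

# Single True/False: does any value contain the substring?
has_substring = bool(contains_substring.any())

print(contains_substring.tolist())
print(has_substring)
```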

Can I perform a case-insensitive check?

You can perform a case-insensitive check using the str.contains() method in pandas. You can achieve this by setting the case parameter to False.
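
The sketch below shows the case parameter in use. For an exact, whole-value match that ignores case, one possible approach (an assumption of this example, not something the article prescribes) is to lower-case the column before calling isin().

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

# Substring match that ignores case: 'spark' matches 'Spark' and 'PySpark'.
contains_ci = df["Courses"].str.contains("spark", case=False)

# Whole-value match that ignores case, via lower-casing first.
exact_ci = df["Courses"].str.lower().isin(["pandas"])

print(contains_ci.tolist())
print(exact_ci.tolist())
```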

How can I negate the condition and filter rows where the column does not contain a specific value?

To negate the condition and filter rows where the column does not contain a specific value, you can use the ~ (tilde) operator along with the condition.
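
For illustration, here is how the ~ operator might be combined with both str.contains() and isin() on a one-column copy of the article's data. The variable names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

# Rows whose course name does NOT contain the substring 'Spark'.
not_spark = df[~df["Courses"].str.contains("Spark")]

# Rows whose course is NOT one of the listed values.
not_in_list = df[~df["Courses"].isin(["Spark", "Python"])]

print(not_spark)
print(not_in_list)
```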

How can I check if any value in a column is missing (NaN)?

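To check if any value in a column is missing (NaN), you can use the isna() method in pandas. For example, with contains_missing = df['Courses'].isna(), contains_missing will be a boolean Series where each element indicates whether the corresponding value in the ‘Courses’ column is missing (NaN). Calling contains_missing.any() then tells you whether at least one value is missing.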
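
A short sketch of the missing-value check. The DataFrame here is hypothetical, since the article's data has no missing values, and the has_missing and missing_count names are illustrative.

```python
import pandas as pd

# Hypothetical data with two missing entries.
df = pd.DataFrame({"Courses": ["Spark", None, "Python", float("nan")]})

# True where the value is missing (None or NaN).
contains_missing = df["Courses"].isna()

# Is anything missing, and how many values are missing?
has_missing = bool(contains_missing.any())
missing_count = int(contains_missing.sum())

print(contains_missing.tolist())
print(has_missing, missing_count)
```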

Conclusion

In this article, you have learned how to check if a DataFrame column contains a value or part of a value, with examples, by using the in & not in operators, pandas.Series.isin(), and Series.str.contains(), and how to check if multiple elements exist in a DataFrame column.
