  • Post last modified:November 28, 2024
Pandas – Check Any Value is NaN in DataFrame

Chaining isnull().values.any() lets you check whether a pandas DataFrame contains NaN/None values in any cell (across all rows and columns). The expression returns True if it finds NaN/None in any cell of the DataFrame and False otherwise. In this article, I will explain how to check if any value is NaN in a pandas DataFrame.


NaN stands for Not a Number and is one of the common ways to represent a missing value in data. NaN values are one of the major problems in data analysis because they propagate through operations: arithmetic involving NaN produces NaN, and aggregations may silently skip it. It is therefore a good practice to check whether a DataFrame has any missing data and replace it with values that make sense, for example an empty string or numeric zero.
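As a minimal sketch of that replacement step, assuming a small hypothetical DataFrame that borrows two column names from the sample data used later in this article, fillna() accepts a dictionary that maps each column to its own fill value:

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names mirror the ones used later in this article
df = pd.DataFrame({
    "Fee": [20000, np.nan, 26000],
    "Duration": ["30days", np.nan, "35days"],
})

# Replace NaN with zero in the numeric column and an empty string in the text column
filled = df.fillna({"Fee": 0, "Duration": ""})

# After filling, the NaN check reports that nothing is missing
print(filled.isnull().values.any())
# Output: False
```

Choosing a fill value per column keeps numeric columns numeric, which a single blanket fill value would not.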

Key Points –

  • Use .isna() or .isnull() on a DataFrame to identify NaN (missing) values, returning a DataFrame of the same shape with boolean values.
  • Use .isna().any().any() to quickly check if there is any NaN value in the entire DataFrame.
  • Calling .isna().any() with the default axis=0 reduces down each column and reveals which columns contain at least one NaN value.
  • The .isna().sum().sum() approach provides the total count of NaN values across the entire DataFrame.
  • Use .isna().sum() to get a count of NaN values per column, useful for understanding which columns are most affected by missing data.
  • Checking for NaN values is often the first step in data cleaning, allowing you to handle missing values appropriately.
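The key points use .isna() while most examples in this article use .isnull(). In pandas the two are aliases, so they can be swapped freely. A short sketch, assuming a hypothetical two-column DataFrame that is not part of the article's sample data:

```python
import numpy as np
import pandas as pd

# Hypothetical data used only to compare the two spellings
df = pd.DataFrame({"Fee": [20000, np.nan], "Discount": [np.nan, np.nan]})

# isna() and isnull() produce identical boolean masks
same_mask = df.isna().equals(df.isnull())

# Any NaN anywhere in the DataFrame?
has_nan = df.isna().any().any()

# Total number of NaN values (1 in Fee, 2 in Discount)
total = df.isna().sum().sum()

print(same_mask, has_nan, total)
# Output: True True 3
```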

Quick Examples of Check If any Value is NaN

If you are in a hurry, below are some quick examples of how to check if any value is NaN in a pandas DataFrame.


# Quick examples of checking whether any value is NaN

# Check for NaN across the entire DataFrame
value = df.isnull().values.any()

# Check a single column
value = df['Fee'].isnull().values.any()

# Check multiple columns
value = df[['Fee','Duration']].isnull().values.any()

# Count NaN values per column across the entire DataFrame
result = df.isnull().sum()

# Count NaN values in a single column
result = df['Fee'].isnull().sum()

# Count NaN values per column for selected columns
result = df[['Fee','Duration']].isnull().sum()

# Get the total count of NaN values across all columns
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))

Now, let’s create a DataFrame with a few rows and columns and execute some examples and validate the output. Our DataFrame contains column names Courses, Fee, Duration, and Discount with some NaN values.


# Create Sample DataFrame
import pandas as pd
import numpy as np
technologies = ({
     'Courses':["Spark","Java","Hadoop","Python","pandas"],
     'Fee' :[20000,np.nan,26000,np.nan,24000],
     'Duration':['30days',np.nan,'35days','40days',np.nan],
     'Discount':[1000,np.nan,2500,2100,np.nan]
               })
df = pd.DataFrame(technologies)
print(df)

This yields the output below.


# Output:
  Courses      Fee Duration  Discount
0   Spark  20000.0   30days    1000.0
1    Java      NaN      NaN       NaN
2  Hadoop  26000.0   35days    2500.0
3  Python      NaN   40days    2100.0
4  pandas  24000.0      NaN       NaN

Check If any Value is NaN in Pandas DataFrame

Use DataFrame.isnull().values.any() to check whether there is any missing data in a pandas DataFrame; missing data is represented as NaN or None values. Note that values is a lowercase attribute that returns the underlying NumPy array, and any() is then called on that array. When your data contains NaN or None, the expression returns the boolean value True; otherwise it returns False. After identifying the columns with NaN, you may want to replace NaN with zero or with a blank (empty) string.


# Check across all cells for NaN values
value = df.isnull().values.any()
print(value)
# Output: True

The above example checks all columns and returns True when it finds at least a single NaN/None value.
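To see the False case as well, the sketch below compares a DataFrame that has a missing value with a copy whose incomplete rows have been dropped. The shortened DataFrame and the use of dropna() are illustrative assumptions, not part of the original example.

```python
import numpy as np
import pandas as pd

# Shortened, hypothetical version of the sample data
df = pd.DataFrame({
    "Courses": ["Spark", "Java", "Hadoop"],
    "Fee": [20000, np.nan, 26000],
})

# Drop every row that contains at least one NaN
clean = df.dropna()

has_nan = df.isnull().values.any()
clean_has_nan = clean.isnull().values.any()

print(has_nan)
# Output: True
print(clean_has_nan)
# Output: False
```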

Check for NaN Values on Selected Columns

If you want to check whether NaN values exist in selected columns (single or multiple), first select the columns and then run the same method.


# Checking on Single Column
value = df['Fee'].isnull().values.any()
print(value)
# Output: True

# Checking on a Single Column that has no NaN values
value = df['Courses'].isnull().values.any()
print(value)

# Output: False

# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()
print(value)

# Output: True
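When you do not know in advance which columns to select, the per-column result of isnull().any() can be used to list the affected column names. This is a minimal sketch that rebuilds the article's sample data; the variable names nan_flags and columns_with_nan are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "Java", "Hadoop", "Python", "pandas"],
    "Fee": [20000, np.nan, 26000, np.nan, 24000],
    "Duration": ["30days", np.nan, "35days", "40days", np.nan],
    "Discount": [1000, np.nan, 2500, 2100, np.nan],
})

# Boolean Series indexed by column name: True where the column has any NaN
nan_flags = df.isnull().any()

# Keep only the True entries and read the column names off the index
columns_with_nan = nan_flags[nan_flags].index.tolist()

print(columns_with_nan)
# Output: ['Fee', 'Duration', 'Discount']
```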

Using DataFrame.isnull() Method

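DataFrame.isnull() checks each cell for a missing value: it returns True where the cell holds NaN/None and False otherwise. The result has the same shape as the input. The example below calls it on a single column, so it returns a boolean Series.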


# Using DataFrame.isnull() method
df2 = df['Fee'].isnull()
print(df2)

This yields the output below.


# Output:
0    False
1     True
2    False
3     True
4    False
Name: Fee, dtype: bool
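A boolean mask like this is also useful as a row filter. The sketch below, which rebuilds the sample data and uses the illustrative names mask and missing_fee, shows the full-DataFrame mask and then selects the rows where Fee is missing.

```python
import numpy as np
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "Java", "Hadoop", "Python", "pandas"],
    "Fee": [20000, np.nan, 26000, np.nan, 24000],
    "Duration": ["30days", np.nan, "35days", "40days", np.nan],
    "Discount": [1000, np.nan, 2500, 2100, np.nan],
})

# On a whole DataFrame, isnull() returns a boolean DataFrame of the same shape
mask = df.isnull()
print(mask.shape)
# Output: (5, 4)

# Use the single-column mask to keep only the rows where Fee is missing
missing_fee = df[df["Fee"].isnull()]
print(missing_fee["Courses"].tolist())
# Output: ['Java', 'Python']
```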

Count the NaN Values on Single or Multiple DataFrame Columns

You can also count the NaN/None values present in the entire DataFrame or in single or multiple columns.


# Count NaN values per column across the entire DataFrame
result = df.isnull().sum()
print(result)

# Output:
# Courses     0
# Fee         2
# Duration    2
# Discount    2
# dtype: int64

# Count NaN values in a single column of the DataFrame
result = df['Fee'].isnull().sum()
print(result)

# Output:
# 2

# Count NaN values per column for selected columns of the DataFrame
result = df[['Fee','Duration']].isnull().sum()
print(result)

# Output:
# Fee         2
# Duration    2
# dtype: int64

Note that when you use sum() on multiple columns or on the entire DataFrame, it returns the NaN count for each column.

Total Count of NaN Values on Entire DataFrame

To get the combined total count of NaN values, use isnull().sum().sum() on DataFrame. The below example returns the total count of NaN values from all columns.


# Get the total count of NaN values
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))

This yields the output below.


# Output:
Number of NaN values present:6
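As a sanity check on that total, the same number can be reached in two other ways: by summing the boolean mask as a NumPy array, or by subtracting the non-null count from the total number of cells. The sketch below rebuilds the sample data; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "Java", "Hadoop", "Python", "pandas"],
    "Fee": [20000, np.nan, 26000, np.nan, 24000],
    "Duration": ["30days", np.nan, "35days", "40days", np.nan],
    "Discount": [1000, np.nan, 2500, 2100, np.nan],
})

# Approach used in the article: sum per column, then sum the column totals
total_chained = df.isnull().sum().sum()

# Sum the boolean mask directly as a NumPy array
total_numpy = df.isnull().to_numpy().sum()

# Total cells (5 rows x 4 columns) minus the non-null cells
total_from_count = df.size - df.count().sum()

print(total_chained, total_numpy, total_from_count)
# Output: 6 6 6
```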

Complete Example of Checking If Any Value is NaN

Below is the complete example of how to check if any value is NaN in pandas DataFrame.


import pandas as pd
import numpy as np
technologies = ({
     'Courses':["Spark","Java","Hadoop","Python","pandas"],
     'Fee' :[20000,np.nan,26000,np.nan,24000],
     'Duration':['30days',np.nan,'35days','40days',np.nan],
     'Discount':[1000,np.nan,2500,2100,np.nan]
               })
df = pd.DataFrame(technologies)
print(df)

# Checking NaN on entire DataFrame
value = df.isnull().values.any()
print(value)

# Checking on Single Column
value = df['Fee'].isnull().values.any()
print(value)

# Checking on Single Column
value = df['Courses'].isnull().values.any()
print(value)

# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()
print(value)

# Using DataFrame.isnull() method
df2 = df['Fee'].isnull()
print(df2)

# Count NaN values per column across the entire DataFrame
result = df.isnull().sum()
print(result)

# Count NaN values in a single column of the DataFrame
result = df['Fee'].isnull().sum()
print(result)

# Count NaN values per column for selected columns of the DataFrame
result = df[['Fee','Duration']].isnull().sum()
print(result)

# To get the Count
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))

FAQ on Check Any Value is NaN in Pandas DataFrame

How do I check if there are any NaN values in a DataFrame?

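To check if there are any NaN values in a Pandas DataFrame, combine the isna() function with the any() method. df.isna().any() reports the result per column, and chaining a second call, df.isna().any().any(), reduces that to a single True or False for the whole DataFrame.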

How can I check for NaN values column-wise?

To check for NaN values column-wise in a Pandas DataFrame, use df.isna().any(). With the default axis=0 it returns a boolean Series indexed by column name, with True for each column that contains at least one NaN.

How can I check for NaN values row-wise?

To check for NaN values row-wise in a Pandas DataFrame, use the isna() function combined with the any() method along the row axis, df.isna().any(axis=1). It returns a boolean Series indexed by row label.
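A minimal sketch of the row-wise check on the article's sample data; the variable name rows_with_nan is illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "Java", "Hadoop", "Python", "pandas"],
    "Fee": [20000, np.nan, 26000, np.nan, 24000],
    "Duration": ["30days", np.nan, "35days", "40days", np.nan],
    "Discount": [1000, np.nan, 2500, 2100, np.nan],
})

# True for every row that has at least one NaN in any column
rows_with_nan = df.isna().any(axis=1)

print(rows_with_nan.tolist())
# Output: [False, True, False, True, True]

# The same mask selects the incomplete rows
print(df[rows_with_nan]["Courses"].tolist())
# Output: ['Java', 'Python', 'pandas']
```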

What is the most efficient way to check for NaN in large DataFrames?

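df.isna().values.any() is usually a fast option because it reduces the boolean mask from isna() as a single NumPy array in one pass, rather than reducing column by column and then again across the results. The difference depends on the shape and dtypes of your data, so time the alternatives on your own DataFrame if performance matters.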

How does any() behave with MultiIndex DataFrames?

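A MultiIndex does not change how isna() or any() work: df.isna().any() still reduces down each column, and axis=1 reduces across each row. To check for NaN within each group of an index level, group the boolean mask by that level first, for example df.isna().groupby(level=0).any().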
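The sketch below illustrates both cases on a small hypothetical DataFrame with a two-level index. The level names group and item and the columns x and y are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level index: "group" is the outer level, "item" the inner one
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("b", 1), ("b", 2)],
    names=["group", "item"],
)
mi_df = pd.DataFrame(
    {"x": [1.0, np.nan, 3.0, 4.0], "y": [np.nan, 2.0, 3.0, 4.0]},
    index=idx,
)

# Column-wise check works exactly as on a flat index
per_column = mi_df.isna().any()
print(per_column.tolist())
# Output: [True, True]

# Per-level check: group the boolean mask by the outer level, then reduce
per_group = mi_df.isna().groupby(level="group").any()
print(per_group)
```

Here group a has a NaN in both columns and group b has none, so per_group holds True for row a and False for row b.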

How do I count the number of columns or rows with any NaN values?

To count the number of columns or rows in a Pandas DataFrame that contain at least one NaN value, combine isna() with any() and sum(): df.isna().any().sum() counts the affected columns, and df.isna().any(axis=1).sum() counts the affected rows.
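Applied to the article's sample data, a minimal sketch looks like this; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    "Courses": ["Spark", "Java", "Hadoop", "Python", "pandas"],
    "Fee": [20000, np.nan, 26000, np.nan, 24000],
    "Duration": ["30days", np.nan, "35days", "40days", np.nan],
    "Discount": [1000, np.nan, 2500, 2100, np.nan],
})

# any() flags each column (or row) that has a NaN; sum() counts the True flags
columns_affected = df.isna().any().sum()
rows_affected = df.isna().any(axis=1).sum()

print(columns_affected)
# Output: 3
print(rows_affected)
# Output: 3
```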

Conclusion

In this article, you have learned how to check if any value is NaN in the entire pandas DataFrame, or in a single column or multiple columns, using DataFrame.isnull().values.any(), and how to count NaN values per column using DataFrame.isnull().sum(). You have also learned how to get the total count of NaN values using DataFrame.isnull().sum().sum().

Happy Learning!!
