Pandas – Check Any Value is NaN in DataFrame

You can use the expression isnull().values.any() to check whether a pandas DataFrame contains NaN/None values in any cell (across all rows and columns). It returns True if at least one cell holds NaN/None and False otherwise. In this article, I will explain how to check if any value is NaN in a pandas DataFrame.

NaN stands for Not a Number and is one of the common ways to represent a missing value in data. Missing values are a major problem in data analysis because operations on data that contains NaN can produce unexpected results. It is therefore good practice to check whether a DataFrame has any missing data and replace it with values that make sense, for example an empty string or numeric zero.
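
As a rough illustration of the replacement step mentioned above, the sketch below fills a numeric column with zero and a text column with an empty string using DataFrame.fillna(). The frame named sample and its values are made up for this illustration; which replacement values make sense depends on your data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with one numeric and one text column
sample = pd.DataFrame({
    "Fee": [20000, np.nan, 26000],
    "Duration": ["30days", np.nan, "35days"],
})

# Replace missing values per column: 0 for the numeric column,
# an empty string for the text column
filled = sample.fillna({"Fee": 0, "Duration": ""})

# After filling, no NaN/None values remain
print(filled.isnull().values.any())
# Outputs: False
```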

1. Quick Examples of Checking If Any Value is NaN

If you are in a hurry, below are some quick examples of how to check if any value is NaN in a pandas DataFrame.


# Below are some quick examples
# Checking NaN on entire DataFrame
value = df.isnull().values.any()

# Checking on Single Column
value = df['Fee'].isnull().values.any()

# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()

# Count NaN on entire DataFrame (per column)
result = df.isnull().sum()

# Count NaN on single column of DataFrame
result = df['Fee'].isnull().sum()

# Count NaN on selected columns of DataFrame
result = df[['Fee','Duration']].isnull().sum()

# Get total count of NaN across all columns
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))

Now, let’s create a DataFrame with a few rows and columns and execute some examples and validate the output. Our DataFrame contains column names Courses, Fee, Duration, and Discount with some NaN values.


# Create Sample DataFrame
import pandas as pd
import numpy as np
technologies = ({
     'Courses':["Spark","Java","Hadoop","Python","pandas"],
     'Fee' :[20000,np.nan,26000,np.nan,24000],
     'Duration':['30days',np.nan,'35days','40days',np.nan],
     'Discount':[1000,np.nan,2500,2100,np.nan]
               })
df = pd.DataFrame(technologies)
print(df)

Yields below output.


  Courses      Fee Duration  Discount
0   Spark  20000.0   30days    1000.0
1    Java      NaN      NaN       NaN
2  Hadoop  26000.0   35days    2500.0
3  Python      NaN   40days    2100.0
4  pandas  24000.0      NaN       NaN

2. Check If any Value is NaN in pandas DataFrame

Use DataFrame.isnull().values.any() to check if there is any missing data in a pandas DataFrame. Missing data is represented as NaN or None values in a DataFrame. When your data contains NaN or None, this expression returns the boolean value True; otherwise, it returns False.


# Check across all cells for NaN values
value = df.isnull().values.any()
print(value)
# Outputs: True

The above example checks all columns and returns True when it finds at least a single NaN/None value.
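
If you also want to know which columns are responsible, one option (a sketch, not the only way) is to call any() on the result of isnull() without going through .values. This gives one boolean per column, and a second any() reduces those flags to a single value. The names per_column and cols_with_nan are just illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan],
})

# One boolean per column: True if that column has at least one NaN/None
per_column = df.isnull().any()

# Names of the columns that contain missing values
cols_with_nan = per_column[per_column].index.tolist()
print(cols_with_nan)
# Outputs: ['Fee', 'Duration', 'Discount']

# Reduce the per-column flags to a single boolean
has_nan = bool(per_column.any())
print(has_nan)
# Outputs: True
```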

3. Check for NaN Values on Selected Columns

If you want to check whether NaN values exist in selected columns (single or multiple), first select the columns and then run the same check.


# Checking on Single Column
value = df['Fee'].isnull().values.any()
print(value)
# Outputs: True

# Checking on a single column that has no NaN values
value = df['Courses'].isnull().values.any()
print(value)
# Outputs: False

# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()
print(value)
# Outputs: True
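
If you repeat this check often, you could wrap it in a small helper. The function has_nan() below is a hypothetical convenience wrapper written for this article, not part of pandas. It assumes columns is a list of column names, or None to check the whole DataFrame.

```python
import numpy as np
import pandas as pd

def has_nan(frame, columns=None):
    """Return True if any cell in the given columns holds NaN/None.

    columns must be a list of column names; None checks the whole frame.
    """
    subset = frame if columns is None else frame[columns]
    return bool(subset.isnull().values.any())

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan],
})

print(has_nan(df, ['Courses']))
# Outputs: False
print(has_nan(df, ['Fee', 'Duration']))
# Outputs: True
print(has_nan(df))
# Outputs: True
```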

3. Using DataFrame.isnull() Method

DataFrame.isnull() checks whether a value is missing in each cell. It returns True for cells that hold NaN/None and False for all other cells. Calling isnull() on a single column returns a boolean Series, as shown below.


# Using DataFrame.isnull() method
df2 = df['Fee'].isnull()
print(df2)

Yields below output.


0    False
1     True
2    False
3     True
4    False
Name: Fee, dtype: bool
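
A boolean Series like this can also be used as a mask to pull out the rows where the value is missing. The following is a minimal sketch; the names mask and missing_fee are only for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan],
})

# True for rows where Fee is NaN/None
mask = df['Fee'].isnull()

# Keep only the rows where Fee is missing
missing_fee = df[mask]
print(missing_fee['Courses'].tolist())
# Outputs: ['Java', 'Python']
```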

4. Count the NaN Values on Single or Multiple DataFrame Columns

You can also count the NaN/None values present in the entire DataFrame, single or multiple columns.


# Count NaN on entire DataFrame (per column)
result = df.isnull().sum()
print(result)
# Outputs
#Courses     0
#Fee         2
#Duration    2
#Discount    2
#dtype: int64

# Count NaN on single column of DataFrame
result = df['Fee'].isnull().sum()
print(result)
# Outputs
#2

# Count NaN on selected columns of DataFrame
result = df[['Fee','Duration']].isnull().sum()
print(result)
# Outputs
#Fee         2
#Duration    2
#dtype: int64

Note that when you use sum() on multiple columns or on the entire DataFrame, it returns the count of NaN values for each column.
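
The same idea works per row instead of per column. As a sketch, passing axis=1 to sum() counts the NaN values in each row of the sample DataFrame; the name per_row is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan],
})

# axis=1 sums across the columns, giving one count per row
per_row = df.isnull().sum(axis=1)
print(per_row.tolist())
# Outputs: [0, 3, 0, 1, 2]
```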

5. Total Count NaN Values on Entire DataFrame

To get the combined total count of NaN values, use isnull().sum().sum() on DataFrame. The below example returns the total count of NaN values from all columns.


# To get the Count
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))

Yields below output.


Number of NaN values present:6
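
As an alternative sketch, you can sum the underlying boolean array directly and relate the result to the total number of cells (DataFrame.size) to see what share of the data is missing. The names total and share are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan],
})

# Sum the boolean array in one step; each True counts as 1
total = int(df.isnull().values.sum())
print(total)
# Outputs: 6

# Share of missing cells: 6 missing out of 5 rows x 4 columns = 20 cells
share = total / df.size
print(share)
# Outputs: 0.3
```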

6. Complete Example of Checking If Any Value is NaN

Below is the complete example of how to check if any value is NaN in pandas DataFrame.


import pandas as pd
import numpy as np
technologies = ({
     'Courses':["Spark","Java","Hadoop","Python","pandas"],
     'Fee' :[20000,np.nan,26000,np.nan,24000],
     'Duration':['30days',np.nan,'35days','40days',np.nan],
     'Discount':[1000,np.nan,2500,2100,np.nan]
               })
df = pd.DataFrame(technologies)
print(df)

# Checking NaN on entire DataFrame
value = df.isnull().values.any()
print(value)

# Checking on Single Column
value = df['Fee'].isnull().values.any()
print(value)

# Checking on Single Column
value = df['Courses'].isnull().values.any()
print(value)

# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()
print(value)

# Using DataFrame.isnull() method
df2 = df['Fee'].isnull()
print(df2)

# Count NaN on entire DataFrame (per column)
result = df.isnull().sum()
print(result)

# Count NaN on single column of DataFrame
result = df['Fee'].isnull().sum()
print(result)

# Count NaN on selected columns of DataFrame
result = df[['Fee','Duration']].isnull().sum()
print(result)

# To get the Count
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))

Conclusion

In this article, you have learned how to check if any value is NaN in the entire pandas DataFrame, in a single column, or in multiple columns using DataFrame.isnull().values.any(), and how to count NaN values per column using DataFrame.isnull().sum(). You have also learned how to get the total count of NaN values using DataFrame.isnull().sum().sum().

Happy Learning !!

NNK