By using isnull().values.any() you can check whether a pandas DataFrame contains NaN/None values in any cell (across all rows and columns). It returns True if it finds NaN/None in any cell of the DataFrame and False otherwise. In this article, I will explain how to check if any value is NaN in a pandas DataFrame.
NaN stands for Not a Number and is one of the common ways to represent a missing value in data. NaN values are a major source of problems in data analysis because they propagate through operations and can silently change results. It is therefore good practice to check whether a DataFrame has any missing data and replace it with values that make sense, for example an empty string or numeric zero.
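To make the point about side effects concrete, here is a minimal sketch. The small fee Series is made up for illustration and is not the article's DataFrame. It shows that NaN does not compare equal to itself, which is why a dedicated check such as isnull() is needed, and how NaN can affect arithmetic and aggregation.

```python
import numpy as np
import pandas as pd

# Hypothetical three-value Series with one missing entry
fee = pd.Series([20000, np.nan, 26000])

# NaN never compares equal to anything, including itself
nan_equals_itself = (np.nan == np.nan)
print(nan_equals_itself)          # False

# So an equality test cannot find missing values, while isnull() can
found_by_equality = (fee == np.nan).any()
found_by_isnull = fee.isnull().any()
print(found_by_equality)          # False
print(found_by_isnull)            # True

# Element-wise arithmetic propagates NaN to the result
shifted = fee + 100
print(shifted)

# Reductions skip NaN by default; skipna=False makes the NaN visible
total = fee.sum()
total_strict = fee.sum(skipna=False)
print(total)                      # 46000.0
print(total_strict)               # nan
```

Because sum() quietly skips NaN by default, a total can look plausible even when values are missing, which is one reason to check for NaN explicitly first.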
1. Quick Examples of Check If any Value is NaN
If you are in a hurry, below are some quick examples of how to check if any value is NaN in a pandas DataFrame.
# Below are the quick examples
# Checking NaN on entire DataFrame
value = df.isnull().values.any()
# Checking on Single Column
value = df['Fee'].isnull().values.any()
# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()
# Count NaN on entire DataFrame
result = df.isnull().sum()
# Count NaN on single column of DataFrame
result = df['Fee'].isnull().sum()
# Count NaN on selected columns of DataFrame
result = df[['Fee','Duration']].isnull().sum()
# Get Total Count of all Columns
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))
Now, let’s create a DataFrame with a few rows and columns, execute some examples, and validate the output. Our DataFrame contains the columns Courses, Fee, Duration, and Discount, with some NaN values.
# Create Sample DataFrame
import pandas as pd
import numpy as np
technologies = ({
'Courses':["Spark","Java","Hadoop","Python","pandas"],
'Fee' :[20000,np.nan,26000,np.nan,24000],
'Duration':['30days',np.nan,'35days','40days',np.nan],
'Discount':[1000,np.nan,2500,2100,np.nan]
})
df = pd.DataFrame(technologies)
print(df)
Yields below output.
# Output:
   Courses      Fee Duration  Discount
0    Spark  20000.0   30days    1000.0
1     Java      NaN      NaN       NaN
2   Hadoop  26000.0   35days    2500.0
3   Python      NaN   40days    2100.0
4   pandas  24000.0      NaN       NaN
2. Check If any Value is NaN in pandas DataFrame
Use DataFrame.isnull().values.any() to check whether there is any missing data in a pandas DataFrame; missing data is represented as NaN or None values. When your data contains NaN or None, this expression returns the boolean value True; otherwise it returns False. After identifying the columns with NaN, you may want to replace NaN with zero or with a blank (empty) string.
# Check across all cells for NaN values
value = df.isnull().values.any()
print(value)
# Outputs: True
The above example checks all columns and returns True when it finds at least a single NaN/None value.
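If the check returns True, one possible follow-up is to fill the missing cells. The sketch below assumes that zero is a sensible default for the numeric columns and an empty string for Duration. Those defaults are illustrative only, so pick whatever makes sense for your data.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan]
})

# Assumed per-column defaults (illustrative, not prescribed by the article)
fill_values = {'Fee': 0, 'Discount': 0, 'Duration': ''}

# fillna() returns a new DataFrame; the original df is left unchanged
df_filled = df.fillna(value=fill_values)

print(df.isnull().values.any())         # True
print(df_filled.isnull().values.any())  # False
```

Passing a dictionary to fillna() lets each column get its own replacement value, so numeric columns stay numeric and text columns stay text.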
3. Check for NaN Values on Selected Columns
If you want to check whether NaN values exist in selected columns (single or multiple), first select the columns and then run the same method.
# Checking on Single Column
value = df['Fee'].isnull().values.any()
print(value)
# Outputs: True
# Checking on Single Column
value = df['Courses'].isnull().values.any()
print(value)
# Output: False
# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()
print(value)
# Output: True
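Instead of testing columns one at a time, you can also ask which columns contain at least one NaN. The following is a small sketch using DataFrame.isnull().any(), which reduces each column to a single boolean. The variable names has_nan and cols_with_nan are my own.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan]
})

# One boolean per column: True if that column has at least one NaN/None
has_nan = df.isnull().any()
print(has_nan)

# Keep only the True entries and read the column names off the index
cols_with_nan = has_nan[has_nan].index.tolist()
print(cols_with_nan)
# Output: ['Fee', 'Duration', 'Discount']
```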
3. Using DataFrame.isnull() Method
DataFrame.isnull() checks each cell for a missing value: it returns True where it finds NaN/None and False otherwise. Applied to a single column, as in the example below, it returns a boolean Series.
# Using DataFrame.isnull() method
df2 = df['Fee'].isnull()
print(df2)
Yields below output.
# Output:
0    False
1     True
2    False
3     True
4    False
Name: Fee, dtype: bool
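The boolean Series returned by isnull() can also be used as a mask to look at the affected rows. This sketch selects the rows where Fee is missing and, using notnull(), the rows where it is present. The names missing_fee and present_fee are illustrative.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan]
})

# Rows where Fee is NaN (boolean mask selects the True positions)
missing_fee = df[df['Fee'].isnull()]
print(missing_fee['Courses'].tolist())
# Output: ['Java', 'Python']

# Rows where Fee has a value; notnull() is the inverse of isnull()
present_fee = df[df['Fee'].notnull()]
print(present_fee['Courses'].tolist())
# Output: ['Spark', 'Hadoop', 'pandas']
```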
4. Count the NaN Values on Single or Multiple DataFrame Columns
You can also count the NaN/None values present in the entire DataFrame, or in single or multiple columns.
# Count NaN on entire DataFrame
result = df.isnull().sum()
print(result)
# Output:
# Courses     0
# Fee         2
# Duration    2
# Discount    2
# dtype: int64
# Count NaN on single column of DataFrame
result = df['Fee'].isnull().sum()
print(result)
# Output:
# 2
# Count NaN on selected columns of DataFrame
result = df[['Fee','Duration']].isnull().sum()
print(result)
# Output:
# Fee         2
# Duration    2
# dtype: int64
Note that when you use sum() on multiple columns or the entire DataFrame, it returns the NaN count for each column.
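The same idea works per row rather than per column. As a sketch, passing axis=1 to sum() counts the NaN values in each row, which can help spot records that are mostly empty. The name per_row is my own.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan]
})

# axis=1 sums across the columns, giving one NaN count per row
per_row = df.isnull().sum(axis=1)
print(per_row.tolist())
# Output: [0, 3, 0, 1, 2]
```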
5. Total Count NaN Values on Entire DataFrame
To get the combined total count of NaN values, use isnull().sum().sum() on the DataFrame. The example below returns the total count of NaN values across all columns.
# To get the Count
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))
Yields below output.
# Output:
Number of NaN values present:6
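As an optional sanity check, the total can be cross-checked against the number of non-missing cells. This sketch relies on DataFrame.size (total number of cells) and DataFrame.count() (non-NaN cells per column). The names total_missing, cross_check, and missing_share are illustrative.

```python
import pandas as pd
import numpy as np

df = pd.DataFrame({
    'Courses': ["Spark", "Java", "Hadoop", "Python", "pandas"],
    'Fee': [20000, np.nan, 26000, np.nan, 24000],
    'Duration': ['30days', np.nan, '35days', '40days', np.nan],
    'Discount': [1000, np.nan, 2500, 2100, np.nan]
})

# Total NaN count, as in the example above
total_missing = int(df.isnull().sum().sum())

# Cross-check: all cells minus the cells that hold a value
cross_check = int(df.size - df.count().sum())

# Share of cells that are missing (5 rows x 4 columns = 20 cells)
missing_share = total_missing / df.size

print(total_missing)   # 6
print(cross_check)     # 6
print(missing_share)   # 0.3
```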
6. Complete Example to Check If Any Value Is NaN
Below is the complete example of how to check if any value is NaN in a pandas DataFrame.
import pandas as pd
import numpy as np
technologies = ({
'Courses':["Spark","Java","Hadoop","Python","pandas"],
'Fee' :[20000,np.nan,26000,np.nan,24000],
'Duration':['30days',np.nan,'35days','40days',np.nan],
'Discount':[1000,np.nan,2500,2100,np.nan]
})
df = pd.DataFrame(technologies)
print(df)
# Checking NaN on entire DataFrame
value = df.isnull().values.any()
print(value)
# Checking on Single Column
value = df['Fee'].isnull().values.any()
print(value)
# Checking on Single Column
value = df['Courses'].isnull().values.any()
print(value)
# Checking on multiple columns
value = df[['Fee','Duration']].isnull().values.any()
print(value)
# Using DataFrame.isnull() method
df2 = df['Fee'].isnull()
print(df2)
# Count NaN on entire DataFrame
result = df.isnull().sum()
print(result)
# Count NaN on single column of DataFrame
result = df['Fee'].isnull().sum()
print(result)
# Count NaN on selected columns of DataFrame
result = df[['Fee','Duration']].isnull().sum()
print(result)
# To get the Count
count = df.isnull().sum().sum()
print('Number of NaN values present:' +str(count))
Conclusion
In this article, you have learned how to check if any value is NaN in the entire pandas DataFrame, or in a single column or multiple columns, using DataFrame.isnull().values.any(), and how to count NaN values per column using DataFrame.isnull().sum(). You have also learned how to get the total count of NaN values using DataFrame.isnull().sum().sum().
Happy Learning !!