You can use the pandas.DataFrame.fillna() or pandas.DataFrame.replace() method to replace all NaN or None values in an entire DataFrame with zeros (0). NaN, which stands for Not a Number, is the usual representation of missing values in pandas; None is sometimes used for the same purpose. Handling missing data is an important step before you process a dataset.
In this article, I will explain how to replace NaN values with zeros in single or multiple columns of a pandas DataFrame in several ways.
Key Points
- DataFrame.fillna() is used to replace NaN/None with any value.
- DataFrame.replace() does find and replace. It finds NaN values and replaces them with a specific value.
- numpy.nan is used to specify a NaN value. NaN is a float value.
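As a quick check of the last point, the following minimal sketch confirms that np.nan is a Python float and that pandas stores None in a numeric Series as NaN. The variable names (nan_is_float, s, filled) are only for illustration.

```python
import numpy as np
import pandas as pd

# np.nan is an ordinary Python float
nan_is_float = isinstance(np.nan, float)
print(nan_is_float)

# In a numeric Series, None is stored as NaN, so isna() flags both entries
s = pd.Series([1.5, None, np.nan])
print(s.isna().tolist())

# fillna(0) therefore replaces both kinds of missing value
filled = s.fillna(0)
print(filled.tolist())
```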
Quick Examples of Replace NaN with Zero
Following are quick examples of replacing NaN values with zeros in a pandas DataFrame.
# Quick examples of replacing NaN with zero
# Example 1: Replace NaN with zero on all columns
df2 = df.fillna(0)
# Example 2: Replace NaN with zero in place
df.fillna(0, inplace=True)
# Example 3: Replace on single column
df["Fee"] = df["Fee"].fillna(0)
# Example 4: Replace on multiple columns
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
# Example 5: Using replace() on a single column
df["Fee"] = df["Fee"].replace(np.nan, 0)
# Example 6: Using replace() on all columns
df2 = df.replace(np.nan, 0)
To run the examples of replacing NaN values with zeros, let’s create a pandas DataFrame from a dictionary.
# Create pandas DataFrame
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Hadoop"],
'Fee' :[20000,25000, np.nan],
'Duration':[np.nan,'40days','35days'],
'Discount':[1000,np.nan,1500]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
This creates a DataFrame with one NaN value in each of the Fee, Duration, and Discount columns.
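Before replacing anything, it can help to see where the missing values are. The sketch below rebuilds the same DataFrame and counts the NaN values per column with isna().sum(); the name nan_counts is only illustrative.

```python
import numpy as np
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [20000, 25000, np.nan],
    'Duration': [np.nan, '40days', '35days'],
    'Discount': [1000, np.nan, 1500]
}
df = pd.DataFrame(technologies)

# isna() marks each missing cell as True; sum() counts them per column
nan_counts = df.isna().sum()
print(nan_counts)
```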
Replace NaN Values with Zero on DataFrame
You can call DataFrame.fillna(0) on the DataFrame to replace all NaN/None values with zero. It doesn’t change the existing DataFrame; instead, it returns a new DataFrame with the values filled in.
# Replace NaN with zero on all columns
df2 = df.fillna(0)
print("After replacing NaN values with zero:\n", df2)
In the result, the NaN values in the Fee, Duration, and Discount columns are all replaced with zero.
You can modify the existing DataFrame itself by using the inplace parameter.
# Replace NaN with zero in place
df = pd.DataFrame(technologies)
df.fillna(0,inplace=True)
print("After replacing NaN values with zero:\n", df)
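The difference between the two calls can be checked directly. This is a minimal sketch on a small, hypothetical numeric DataFrame (not the article's data): fillna(0) leaves the original untouched, while inplace=True changes it.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric-only data, used here just for illustration
df = pd.DataFrame({"Fee": [20000, np.nan], "Discount": [np.nan, 1500]})

# Without inplace, the original DataFrame keeps its NaN values
df2 = df.fillna(0)
nans_in_original = int(df.isna().sum().sum())
nans_in_copy = int(df2.isna().sum().sum())

# With inplace=True, the original DataFrame itself is updated
df.fillna(0, inplace=True)
nans_after_inplace = int(df.isna().sum().sum())

print(nans_in_original, nans_in_copy, nans_after_inplace)
```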
Replace NaN Values with Zero on Single or Multiple Columns
Sometimes you may need to replace NaN values with zeros in only one or a few columns of a DataFrame. To do so, apply the fillna() function to the specific column; this replaces all NaN values of that column with zeros.
# Replace on single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].fillna(0)
print("After replacing NaN values of specified column with zero:\n", df)
Yields below output. This replaces NaN with zero in the Fee column only; the other columns keep their NaN values.
# Output:
# After replacing NaN values of specified column with zero:
Courses Fee Duration Discount
0 Spark 20000.0 NaN 1000.0
1 PySpark 25000.0 40days NaN
2 Hadoop 0.0 35days 1500.0
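Note that Fee is still a float column in the output above (0.0, not 0), because the column held NaN before it was filled. If you want whole numbers, one option is to cast the column after filling it. The sketch below uses a single-column DataFrame for brevity and assumes every remaining value is a whole number.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Fee": [20000, 25000, np.nan]})

# Fill the NaN first, then cast; casting a column that still has NaN would fail
df["Fee"] = df["Fee"].fillna(0).astype("int64")
print(df["Fee"].tolist())
print(df["Fee"].dtype)
```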
Similarly, you can replace the NaN values in multiple columns of a DataFrame with zeros by selecting those columns and calling the fillna() function on them.
# Replace on multiple columns
df = pd.DataFrame(technologies)
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
print("After replacing NaN values of multiple columns with zero:\n", df)
Yields below output.
# Output:
# After replacing NaN values of multiple columns with zero:
Courses Fee Duration Discount
0 Spark 20000.0 0 1000.0
1 PySpark 25000.0 40days NaN
2 Hadoop 0.0 35days 1500.0
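Another way to target specific columns is to pass a dictionary to fillna(), where each key is a column name and each value is the fill value for that column. The sketch below fills only Fee and Discount and leaves Duration as it is; the choice of columns is just an example.

```python
import numpy as np
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [20000, 25000, np.nan],
    'Duration': [np.nan, '40days', '35days'],
    'Discount': [1000, np.nan, 1500]
}
df = pd.DataFrame(technologies)

# Only the columns named in the dictionary are filled
filled = df.fillna({"Fee": 0, "Discount": 0})
print(filled)
```

Columns that are not listed in the dictionary, such as Duration here, keep their NaN values.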
Replace NaN Values with Zeroes Using replace()
Alternatively, you can use the DataFrame.replace() method to update NaN values with zero. This method takes a minimum of two parameters: first, the value you want to replace (np.nan in our case), and second, the value you want to replace it with (zero in our case). For replacing NaN with zero, it gives the same result as fillna(0).
In this example, I will replace the NaN values of a specified column with zero.
# Using replace
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].replace(np.nan, 0)
print("After replacing NaN values of specified column with zero:\n", df)
Yields below output.
# Output:
# After replacing NaN values of specified column with zero:
Courses Fee Duration Discount
0 Spark 20000.0 NaN 1000.0
1 PySpark 25000.0 40days NaN
2 Hadoop 0.0 35days 1500.0
Using DataFrame.replace() on All Columns
You can also use df.replace(np.nan, 0) on the whole DataFrame to replace all NaN values with zeros. Pass np.nan along with 0 to this function, and it replaces every NaN value with zero.
# Using replace()
df = pd.DataFrame(technologies)
df2 = df.replace(np.nan, 0)
print("After replacing NaN values with zero:\n", df2)
This replaces the NaN values in all columns of the DataFrame with zero.
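To confirm that both approaches agree, you can compare their results with DataFrame.equals(). This is a minimal sketch on a hypothetical numeric-only DataFrame; the names with_fillna, with_replace, and same_result are illustrative.

```python
import numpy as np
import pandas as pd

# Hypothetical numeric-only data, used here just for illustration
df = pd.DataFrame({
    "Fee": [20000, 25000, np.nan],
    "Discount": [1000, np.nan, 1500]
})

with_fillna = df.fillna(0)
with_replace = df.replace(np.nan, 0)

# equals() checks that both results have the same values and dtypes
same_result = with_fillna.equals(with_replace)
print(same_result)
```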
Complete Example For Replace NaN Values with Zeroes in a Column
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Hadoop"],
'Fee' :[20000,25000, np.nan],
'Duration':[np.nan,'40days','35days'],
'Discount':[1000,np.nan,1500]
}
df = pd.DataFrame(technologies)
print(df)
# Replace NaN with zero on all columns
df2 = df.fillna(0)
print(df2)
# Replace NaN with zero in place
df = pd.DataFrame(technologies)
df.fillna(0,inplace=True)
print(df)
# Replace on single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].fillna(0)
print(df)
# Replace on multiple columns
df = pd.DataFrame(technologies)
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
print(df)
# Using replace() on a single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].replace(np.nan, 0)
print(df)
# Using replace() on all columns
df = pd.DataFrame(technologies)
df2 = df.replace(np.nan, 0)
print(df2)
Frequently Asked Questions on Pandas Replace NaN Values with Zeroes
To replace NaN values with zeroes in the entire DataFrame, use the fillna() method on the DataFrame itself. For example, df = df.fillna(0)
To replace NaN values with zeroes in specific columns only, select those columns and apply the fillna() function to them.
To replace NaN values with zeroes in specific rows, use boolean indexing. For example, df.loc[df['condition_column'].isna(), 'column_to_replace'] = 0
The fillna(0) and replace(np.nan, 0) methods give the same result. However, fillna(0) is more concise and more commonly used for replacing NaN values with zeroes in pandas DataFrames.
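The boolean-indexing answer above can be sketched as follows. The data and the row condition (Discount greater than 1000) are hypothetical, chosen only to show how a mask limits the replacement to certain rows.

```python
import numpy as np
import pandas as pd

# Hypothetical data: Fee is missing in two rows
df = pd.DataFrame({
    "Fee": [20000, np.nan, np.nan],
    "Discount": [1000, 500, 1500]
})

# Select rows where Fee is NaN and Discount is greater than 1000
mask = df["Fee"].isna() & (df["Discount"] > 1000)

# Replace NaN with zero only in the selected rows
df.loc[mask, "Fee"] = 0
print(df)
```

Here only the last row is updated; the second row keeps its NaN because it does not meet the Discount condition.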
Conclusion
In this article, I have explained how to replace the NaN values of an entire DataFrame with zeroes using the DataFrame.fillna() and DataFrame.replace() methods. You have also learned how to use the fillna() and replace() functions on single or multiple columns to replace the NaN values of only those columns with zeros.
Happy Learning !!