Pandas Replace NaN Values with Zero in a Column

  • Post last modified: March 27, 2024

Use the pandas.DataFrame.fillna() or pandas.DataFrame.replace() method to replace all NaN or None values in a DataFrame with zero (0). NaN stands for Not a Number and is one of the common ways to represent a missing value in data; None is sometimes used for the same purpose. Handling missing data is an important step before you process a dataset in pandas.

In this article, I will explain several ways to replace NaN values with zeros in single or multiple columns of a pandas DataFrame.

Key Points-

  • DataFrame.fillna() replaces NaN/None with a value you specify.
  • DataFrame.replace() does find and replace: it finds a given value (NaN in our case) and replaces it with another value.
  • NaN stands for Not a Number and is one of the common ways to represent a missing value in data. Sometimes None is also used.
  • numpy.nan is used to specify a NaN value. NaN is a float value.
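The points above about NaN and None can be checked with a short sketch. The Series `s` is a made-up example and is not part of the article's data.

```python
import numpy as np
import pandas as pd

# np.nan is an ordinary Python float
print(type(np.nan))            # <class 'float'>

# NaN never compares equal to itself, so use pd.isna() to detect it
print(np.nan == np.nan)        # False
print(pd.isna(np.nan))         # True

# pandas also treats None as a missing value
print(pd.isna(None))           # True

# A numeric Series built with None stores it as NaN (dtype float64)
s = pd.Series([1, None, 3])
print(s.dtype)                 # float64
print(s.isna().tolist())       # [False, True, False]
```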

1. Quick Examples of Replace NaN with Zero

If you are in a hurry, below are some quick examples of replacing NaN values with zeros in a pandas DataFrame.


# Below are the quick examples

# Example 1: Replace NaN with zero on all columns
df2 = df.fillna(0)

# Example 2: Replace in place
df.fillna(0, inplace=True)

# Example 3: Replace on single column
df["Fee"] = df["Fee"].fillna(0)

# Example 4: Replace on multiple columns
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)

# Example 5: Using replace() on a single column
df["Fee"] = df["Fee"].replace(np.nan, 0)

# Example 6: Using replace() on all columns
df2 = df.replace(np.nan, 0)

Now, let’s create a DataFrame with a few rows and columns and run some examples to learn how to replace NaN values with zero in a column. Our DataFrame contains the columns Courses, Fee, Duration, and Discount, with NaN values in one string column (Duration) and two numeric columns (Fee and Discount).


# Create pandas DataFrame
import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [20000, 25000, np.nan],
    'Duration': [np.nan, '40days', '35days'],
    'Discount': [1000, np.nan, 1500]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

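Yields below output.


# Output:
# Create DataFrame:
   Courses      Fee Duration  Discount
0    Spark  20000.0      NaN    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      NaN   35days    1500.0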
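Before replacing anything, it can help to see how many NaN values each column holds. The following is a minimal sketch that rebuilds the same DataFrame; the variable name `nan_counts` is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, np.nan],
    "Duration": [np.nan, "40days", "35days"],
    "Discount": [1000, np.nan, 1500],
})

# isna() marks every missing cell; sum() then counts them per column
nan_counts = df.isna().sum()
print(nan_counts)
```

Each of Fee, Duration, and Discount has one missing value, and Courses has none.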

2. Replace NaN Values with Zero on pandas DataFrame

You can call DataFrame.fillna(0) on the DataFrame to replace all NaN/None values with zero. It doesn’t change the existing DataFrame; instead, it returns a new DataFrame with the values filled in.


# Replace NaN with zero on all columns
df2 = df.fillna(0)
print("After replacing NaN values with zero:\n", df2)

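Yields below output.


# Output:
# After replacing NaN values with zero:
   Courses      Fee Duration  Discount
0    Spark  20000.0        0    1000.0
1  PySpark  25000.0   40days       0.0
2   Hadoop      0.0   35days    1500.0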
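To confirm that fillna() leaves the original DataFrame untouched, you can count the missing values before and after. This is a small sketch on a numeric-only subset of the example data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Fee": [20000, 25000, np.nan],
    "Discount": [1000, np.nan, 1500],
})

df2 = df.fillna(0)

# The original still has its NaN values; only df2 is filled
print(df.isna().sum().sum())    # 2
print(df2.isna().sum().sum())   # 0
```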

You can modify the existing DataFrame itself by passing the inplace=True parameter.


# Replace NaN with zero in place
df = pd.DataFrame(technologies)
df.fillna(0, inplace=True)
print("After replacing NaN values with zero:\n", df)

3. Replace NaN Values with Zero on Single or Multiple Columns

Sometimes you need to replace NaN values with zeros in only one or a few columns of a DataFrame. To do so, apply the fillna() function to the specified column; it replaces all NaN values of that particular column with zeros.


# Replace on single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].fillna(0)
print("After replacing NaN values of specified column with zero:\n", df)

Yields below output. This replaces NaN with zero on the Fee column.


# Output:
# After replacing NaN values of specified column with zero:
   Courses      Fee Duration  Discount
0    Spark  20000.0      NaN    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      0.0   35days    1500.0
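Notice that Fee is still shown with a decimal point: a column that held NaN is stored as float, and filling it does not change the dtype. If you want whole numbers, one option is to cast after filling. This is a sketch on a one-column DataFrame, and the choice of int64 is an assumption.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Fee": [20000, 25000, np.nan]})
print(df["Fee"].dtype)          # float64

# Fill first, then cast; in recent pandas versions, casting a column
# that still contains NaN to an integer dtype raises an error
df["Fee"] = df["Fee"].fillna(0).astype("int64")
print(df["Fee"].tolist())       # [20000, 25000, 0]
```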

Similarly, you can replace the NaN values of multiple columns with zeros by selecting those columns and calling the fillna() function on the selection.


# Replace on multiple columns
df = pd.DataFrame(technologies)
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
print("After replacing NaN values of multiple columns with zero:\n", df)

Yields below output.


# Output:
# After replacing NaN values of multiple columns with zero:
   Courses      Fee Duration  Discount
0    Spark  20000.0        0    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      0.0   35days    1500.0
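Another way to target several columns is to pass fillna() a dictionary that maps column names to fill values. Columns not listed in the dictionary are left as they are. In this sketch the Tax column is hypothetical and is added only to show a column that stays unfilled.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Fee": [20000, 25000, np.nan],
    "Discount": [1000, np.nan, 1500],
    "Tax": [np.nan, 5.0, 7.5],   # hypothetical column, not in the article
})

# Keys are column names, values are the replacement for NaN in that column
df2 = df.fillna({"Fee": 0, "Discount": 0})

print(df2["Fee"].tolist())        # [20000.0, 25000.0, 0.0]
print(df2["Discount"].tolist())   # [1000.0, 0.0, 1500.0]
print(df2["Tax"].isna().tolist()) # [True, False, False]
```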

4. Replace NaN Values with Zeroes Using replace()

Alternatively, you can use the DataFrame.replace() method to replace NaN values with zero. For this use, it takes two parameters: first, the value you want to replace (np.nan in our case), and second, the value you want to replace it with (zero in our case). When replacing NaN with zero, it produces the same result as the fillna() method.

In this example, I will replace the NaN values of the specified column with zero.


# Using replace
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].replace(np.nan, 0)
print("After replacing NaN values of specified column with zero:\n", df)

Yields below output.


# Output:
# After replacing NaN values of specified column with zero:
   Courses      Fee Duration  Discount
0    Spark  20000.0      NaN    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      0.0   35days    1500.0
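Because replace() is a general find-and-replace, it can also target more than one value in a single call. The sketch below assumes a hypothetical dataset in which -1 was used as a second marker for a missing fee; that sentinel is not part of the article's data.

```python
import numpy as np
import pandas as pd

# -1 is a hypothetical sentinel standing in for "missing"
fee = pd.Series([20000, -1, np.nan], name="Fee")

# Pass a list of values to find; each match is replaced with 0
cleaned = fee.replace([np.nan, -1], 0)
print(cleaned.tolist())   # [20000.0, 0.0, 0.0]
```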

5. Using DataFrame.replace() on All Columns

You can also call df.replace(np.nan, 0) on the whole DataFrame. Passing np.nan along with 0 to this function replaces the NaN values in every column with zeros.


# Using replace()
df = pd.DataFrame(technologies)
df2 = df.replace(np.nan, 0)
print("After replacing NaN values with zero:\n", df2)

This replaces the NaN values in all columns of the DataFrame with zero.
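To check that both approaches agree, you can compare their results directly. This is a minimal sketch on a numeric-only subset of the example data; the names `filled` and `replaced` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Fee": [20000, 25000, np.nan],
    "Discount": [1000, np.nan, 1500],
})

filled = df.fillna(0)
replaced = df.replace(np.nan, 0)

# Compare the underlying values cell by cell
same = np.array_equal(filled.to_numpy(), replaced.to_numpy())
print(same)   # True
```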

6. Complete Example of Replacing NaN Values with Zeroes in a Column


import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [20000, 25000, np.nan],
    'Duration': [np.nan, '40days', '35days'],
    'Discount': [1000, np.nan, 1500]
}
df = pd.DataFrame(technologies)
print(df)

# Replace NaN with zero on all columns
df2 = df.fillna(0)
print(df2)

# Replace in place
df = pd.DataFrame(technologies)
df.fillna(0, inplace=True)
print(df)

# Replace on single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].fillna(0)
print(df)

# Replace on multiple columns
df = pd.DataFrame(technologies)
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
print(df)

# Using replace() on a single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].replace(np.nan, 0)
print(df)

# Using replace() on all columns
df = pd.DataFrame(technologies)
df2 = df.replace(np.nan, 0)
print(df2)

Frequently Asked Questions on Pandas Replace NaN Values with Zeroes

How can I replace NaN values with zeroes in a specific column?

You can use the fillna() method to replace NaN values in a specific column with zeroes. For example, df['column_name'] = df['column_name'].fillna(0)

Is there a way to replace NaN values with zeroes for the entire DataFrame?

You can use the fillna() method on the entire DataFrame to replace all NaN values with zeroes. For example, df = df.fillna(0)

How can I replace NaN values with zeroes only in specific columns?

You can select the columns where you want to replace NaN values with zeroes and apply the fillna() function to that selection. For example, df[['column1', 'column2']] = df[['column1', 'column2']].fillna(0)

How do I replace NaN values with zeroes only in specific rows?

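You can use boolean indexing with loc to restrict the replacement to specific rows. For example, df.loc[df['condition_column'].isna(), 'column_to_replace'] = 0 sets column_to_replace to zero in every row where condition_column is NaN. Use the same column in both places to zero out only that column's own NaN values.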
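As a sketch of this row-level approach, the mask below combines a row condition with a NaN check so that only matching rows are changed. The extra "Python" row and the choice of Spark as the condition are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],  # "Python" row is made up
    "Fee": [np.nan, 25000, np.nan, 22000],
})

# Select only rows where the course is Spark AND Fee is missing
mask = (df["Courses"] == "Spark") & df["Fee"].isna()
df.loc[mask, "Fee"] = 0

# Spark's fee is now 0.0; Hadoop's fee is still NaN
print(df)
```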

What is the difference between fillna(0) and replace(np.nan, 0)?

Both methods give the same result when replacing NaN with zero. However, fillna(0) is more concise and is designed specifically for missing values, while replace() is a general find-and-replace method that can target any value.

Conclusion

In this article, you have learned how to replace the NaN values of an entire DataFrame with zeroes using the DataFrame.fillna() and DataFrame.replace() methods. You have also learned how to apply fillna() and replace() to single or multiple columns so that only the NaN values of those columns are replaced with zeros.

Happy Learning !!


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.
