Use the pandas.DataFrame.fillna() or pandas.DataFrame.replace() method to replace NaN or None values with zero (0) in a column of string or numeric type. NaN stands for Not a Number and is one of the most common ways to represent a missing value in data; None is sometimes used as well. Handling missing data is an important step before you process a pandas DataFrame.
In this article, I will explain several ways to replace NaN values with zero in a column of a pandas DataFrame.
Takeaways:
- DataFrame.fillna() replaces NaN/None with any value you specify.
- DataFrame.replace() does a find and replace: it finds NaN values and replaces them with a specific value.
- NaN stands for Not a Number and is one of the common ways to represent a missing value in data. Sometimes None is also used.
- numpy.nan is used to specify a NaN value. NaN is a value of type float.
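To make the last two points concrete, here is a minimal sketch. The Series names fees and discounts are illustrative only and are not part of the examples that follow. It checks that np.nan is a float and that a numeric column containing NaN or None is stored as float64.

```python
import numpy as np
import pandas as pd

# np.nan is an ordinary Python float
nan_is_float = isinstance(np.nan, float)

# A numeric column that contains NaN or None is stored as float64
fees = pd.Series([20000, 25000, np.nan])
discounts = pd.Series([1000, None, 1500])

# isna() flags both NaN and None as missing
missing_fees = fees.isna().tolist()
missing_discounts = discounts.isna().tolist()

print(nan_is_float)
print(fees.dtype, discounts.dtype)
print(missing_fees, missing_discounts)
```

This is why the Fee and Discount values in the examples below print with a decimal point.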
1. Quick Examples of Replace NaN with Zero
If you are in a hurry, below are some quick examples of how to replace NaN values with zeros in a pandas DataFrame.
# Below are quick examples
# Replace NaN with zero on all columns
df2 = df.fillna(0)
# Replace in place
df.fillna(0, inplace=True)
# Replace on single column
df["Fee"] = df["Fee"].fillna(0)
# Replace on multiple columns
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
# Using replace() on a single column
df["Fee"] = df["Fee"].replace(np.nan, 0)
# Using replace() on all columns
df2 = df.replace(np.nan, 0)
Now, let’s create a DataFrame with a few rows and columns and run some examples to learn how to replace NaN values with zero in a column. Our DataFrame contains the columns Courses, Fee, Duration, and Discount, and has some NaN values in both string and numeric columns.
# Create pandas DataFrame
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Hadoop"],
'Fee' :[20000,25000, np.nan],
'Duration':[np.nan,'40days','35days'],
'Discount':[1000,np.nan,1500]
}
df = pd.DataFrame(technologies)
print(df)
This yields the output below.
# Output:
   Courses      Fee Duration  Discount
0    Spark  20000.0      NaN    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      NaN   35days    1500.0
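Before replacing anything, it can help to see where the missing values are. The short sketch below rebuilds the same DataFrame and counts the NaN values per column with isna().sum(). The variable name nan_counts is only illustrative.

```python
import numpy as np
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [20000, 25000, np.nan],
    'Duration': [np.nan, '40days', '35days'],
    'Discount': [1000, np.nan, 1500]
}
df = pd.DataFrame(technologies)

# isna() marks each missing cell as True; sum() counts them per column
nan_counts = df.isna().sum()
print(nan_counts)
```

Here Fee, Duration, and Discount each contain one missing value, while Courses has none.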
2. Replace NaN Values with Zero on pandas DataFrame
Use the DataFrame.fillna(0) method to replace NaN/None values with 0. It doesn’t modify the original DataFrame; it returns a new DataFrame with the values replaced.
# Replace NaN with zero on all columns
df2 = df.fillna(0)
print(df2)
This yields the output below.
# Output:
   Courses      Fee Duration  Discount
0    Spark  20000.0        0    1000.0
1  PySpark  25000.0   40days       0.0
2   Hadoop      0.0   35days    1500.0
You can also update the existing DataFrame object itself by passing the inplace=True parameter.
# Replace NaN with zero in place
df = pd.DataFrame(technologies)
df.fillna(0,inplace=True)
print(df)
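Because NaN is a float, the Fee and Discount columns stay float64 even after the NaN values are filled, so values print as 20000.0 rather than 20000. If you want whole numbers, one option is to chain astype() after fillna(). The sketch below assumes every remaining value is a whole number, and it leaves out the Duration column for brevity.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [20000, 25000, np.nan],
    'Discount': [1000, np.nan, 1500]
})

# Fill the missing values first, then convert each column to integers
df["Fee"] = df["Fee"].fillna(0).astype("int64")
df["Discount"] = df["Discount"].fillna(0).astype("int64")

print(df.dtypes)
print(df)
```

Calling astype("int64") on a column that still contains NaN raises an error, which is why fillna() comes first.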
3. Replace NaN Values with Zero on a Single or Multiple Columns
Sometimes you may need to replace NaN values with 0 in only one or a few columns of a DataFrame. Let’s see this with an example.
# Replace on single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].fillna(0)
print(df)
This yields the output below, with NaN replaced by zero in the Fee column only.
# Output:
   Courses      Fee Duration  Discount
0    Spark  20000.0      NaN    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      0.0   35days    1500.0
You can do the same for multiple columns.
# Replace on multiple columns
df = pd.DataFrame(technologies)
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
print(df)
This yields the output below.
# Output:
   Courses      Fee Duration  Discount
0    Spark  20000.0        0    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      0.0   35days    1500.0
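A single zero is not always the right filler for every column: in the output above, the Duration column ends up mixing the number 0 with strings such as 40days. As a possible alternative, fillna() also accepts a dictionary that maps column names to fill values. In the sketch below, the placeholder "0days" is an assumption made for illustration and is not part of the original data.

```python
import numpy as np
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop"],
    'Fee': [20000, 25000, np.nan],
    'Duration': [np.nan, '40days', '35days'],
    'Discount': [1000, np.nan, 1500]
}
df = pd.DataFrame(technologies)

# Map each column to its own fill value; "0days" is a hypothetical placeholder.
# Columns not listed in the dictionary (Discount here) are left untouched.
filled = df.fillna({"Fee": 0, "Duration": "0days"})
print(filled)
```

This keeps the Duration column made up of strings only and leaves the NaN in Discount as it was.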
4. Replace NaN Values with Zeroes Using replace()
Alternatively, you can use the DataFrame.replace() method to update NaN values with zero. This method takes at least two parameters: first, the value you want to replace (np.nan in our case), and second, the value you want to replace it with (zero in our case). For this purpose it gives the same result as the fillna() method.
# Using replace
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].replace(np.nan, 0)
print(df)
This yields the output below.
# Output:
   Courses      Fee Duration  Discount
0    Spark  20000.0      NaN    1000.0
1  PySpark  25000.0   40days       NaN
2   Hadoop      0.0   35days    1500.0
5. Using DataFrame.replace() on All Columns
You can also use df.replace(np.nan, 0) to replace all NaN values in the DataFrame with zero.
# Using replace()
df = pd.DataFrame(technologies)
df2 = df.replace(np.nan, 0)
print(df2)
This replaces NaN values with zero in every column of the DataFrame.
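As a quick sanity check, the sketch below compares fillna(0) and replace(np.nan, 0) on the numeric columns only (a reduced version of the example DataFrame, chosen for brevity) and confirms that neither call changes the original DataFrame.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'Fee': [20000, 25000, np.nan],
    'Discount': [1000, np.nan, 1500]
})

via_fillna = df.fillna(0)
via_replace = df.replace(np.nan, 0)

# equals() compares both the values and the dtypes of the two results
same = via_fillna.equals(via_replace)

# Both methods return new objects, so the original still holds its NaN values
nan_left_in_original = int(df.isna().sum().sum())

print(same)
print(nan_left_in_original)
```

For plain NaN-to-zero replacement, the choice between the two methods is mostly a matter of readability.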
6. Complete Example of Replacing NaN Values with Zeroes in a Column
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Hadoop"],
'Fee' :[20000,25000, np.nan],
'Duration':[np.nan,'40days','35days'],
'Discount':[1000,np.nan,1500]
}
df = pd.DataFrame(technologies)
print(df)
# Replace NaN with zero on all columns
df2 = df.fillna(0)
print(df2)
# Replace in place
df = pd.DataFrame(technologies)
df.fillna(0,inplace=True)
print(df)
# Replace on single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].fillna(0)
print(df)
# Replace on multiple columns
df = pd.DataFrame(technologies)
df[["Fee","Duration"]] = df[["Fee","Duration"]].fillna(0)
print(df)
# Using replace() on a single column
df = pd.DataFrame(technologies)
df["Fee"] = df["Fee"].replace(np.nan, 0)
print(df)
# Using replace() on all columns
df = pd.DataFrame(technologies)
df2 = df.replace(np.nan, 0)
print(df2)
Conclusion
In this article, you have learned how to replace NaN values with zeroes in a column of a pandas DataFrame using the DataFrame.fillna() and DataFrame.replace() methods. You have also learned how to replace NaN values with zeroes on single and multiple columns, with examples.
Happy Learning !!