Pandas Replace NaN with Blank/Empty String

  • Post category:Pandas
  • Post last modified:January 9, 2024

You can replace NaN values with a blank/empty string in a Pandas DataFrame by using the replace() or fillna() methods. NaN stands for Not a Number and is one of the common ways to represent a missing value in a Python/Pandas DataFrame. You will often need to replace missing values with something that makes sense for the column, such as zeros for numeric columns and blank or empty strings for string-type columns.
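As a minimal sketch of the "zeros for numeric columns, blanks for string columns" idea, fillna() accepts a dict that maps column names to fill values. The small frame named sample below is hypothetical and exists only for this illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical two-column frame, used only for this illustration
sample = pd.DataFrame({
    "Courses": ["Spark", np.nan, "Hadoop"],
    "Fee": [20000.0, np.nan, 22000.0],
})

# A dict passed to fillna() maps column name -> fill value
filled = sample.fillna({"Courses": "", "Fee": 0})
print(filled)
```

Because Fee is filled with a number, it stays numeric, while Courses receives the empty string.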

In this pandas DataFrame article, I will explain how to convert NaN values to blank/empty strings in a single column, in multiple columns (all columns from a list), or in the whole DataFrame, with examples of each approach.

1. Quick Examples of Replacing NaN with Empty/Blank String

If you are in a hurry, below are some quick examples of how to replace NaN with a blank/empty string in Pandas DataFrame.


# Below are the quick examples
# (they assume pandas as pd, NumPy as np, and the DataFrame df created below)

# Replace all NaN values with an empty string
df2 = df.replace(np.nan, '')
print(df2)

# Replace NaN with an empty string in multiple columns
df[['Courses','Fee']] = df[['Courses','Fee']].fillna('')
print(df[['Courses','Fee']])

# Using pandas.DataFrame.fillna() to replace NaN values
df2 = df.fillna("")
print(df2)

# Replace NaN in place; this modifies df itself, so print df
df.fillna('', inplace=True)
print(df)

# Replace NaN with an empty string in a single column
df2 = df.Courses.fillna('')
print(df2)

# Replace NaN with zeros in the Courses column
df['Courses'] = df['Courses'].fillna(0)
print(df['Courses'])

# Replace NaN with zeros in the Discount column
df['Discount'] = df['Discount'].fillna(0)
print(df['Discount'])

# Replace NaN with an empty string using replace() on a column
df2 = df.Courses.replace(np.nan, '')
print(df2)

# Replace NaN with a specified value
df2 = df.Courses.replace(np.nan, 'value')
print(df2)

Now, let’s create a DataFrame with a few rows and columns and execute some examples, and validate the results. Our DataFrame contains column names Courses, Fee, Duration and Discount.


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark",np.nan,"Hadoop","Python","pandas",np.nan,"Java"],
    'Fee' :[20000,25000, np.nan,22000,24000,np.nan,22000],
    'Duration':[np.nan,'40days','35days', np.nan,'60days','50days','55days'],
    'Discount':[1000,np.nan,1500,np.nan,2500,2100,np.nan]
              }
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

Yields below output.

[Image: Pandas NaN Empty String]
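Before replacing anything, it can help to see where the missing values are. The sketch below rebuilds the same DataFrame and counts NaN values per column with isna().sum(); the variable name nan_counts is introduced here for illustration.

```python
import numpy as np
import pandas as pd

technologies = {
    'Courses': ["Spark", np.nan, "Hadoop", "Python", "pandas", np.nan, "Java"],
    'Fee': [20000, 25000, np.nan, 22000, 24000, np.nan, 22000],
    'Duration': [np.nan, '40days', '35days', np.nan, '60days', '50days', '55days'],
    'Discount': [1000, np.nan, 1500, np.nan, 2500, 2100, np.nan],
}
df = pd.DataFrame(technologies)

# Number of missing values in each column
nan_counts = df.isna().sum()
print(nan_counts)

# Column dtypes: a NaN in a numeric column forces it to float64
print(df.dtypes)
```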

2. Convert NaN to Empty String in Pandas

Use df.replace(np.nan, '') to replace all NaN values in the Pandas DataFrame with an empty string. The regex=True argument is not needed here, because np.nan is not a string pattern.


# Replace NaN with an empty string in the whole DataFrame
df2 = df.replace(np.nan, '')
print("After replacing the NaN values with an empty string:\n", df2)

Yields below output.

[Image: Pandas NaN Empty String]

Related: You can also replace an empty/blank string with NaN values.
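The reverse direction mentioned above can be sketched with replace() as well. The frame named sample and its values are hypothetical. This is also a case where regex=True does real work, since the thing being matched is a string pattern.

```python
import numpy as np
import pandas as pd

# Hypothetical column containing an empty string and a whitespace-only string
sample = pd.DataFrame({"Courses": ["Spark", "", "   ", "Java"]})

# Exact match: only the truly empty string becomes NaN
exact = sample.replace("", np.nan)

# Regex match: empty or whitespace-only strings become NaN
blank = sample.replace(r"^\s*$", np.nan, regex=True)

print(exact["Courses"].isna().tolist())
print(blank["Courses"].isna().tolist())
```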

3. Replace NaN with Empty String on Multiple Columns

To replace NaN values with blank strings in multiple columns, or in all columns from a list, use df[['Courses','Fee']] = df[['Courses','Fee']].fillna(''). This replaces NaN values in the Courses and Fee columns. Note that filling the numeric Fee column with a string changes its dtype to object.


# Replace NaN with an empty string in multiple columns
df[['Courses','Fee']] = df[['Courses','Fee']].fillna('')
df2 = df[['Courses','Fee']]
print("After replacing the NaN values with an empty string:\n", df2)

Yields below output.


# Output:
# After replacing the NaN values with an empty string:
  Courses      Fee
0   Spark  20000.0
1          25000.0
2  Hadoop         
3  Python  22000.0
4  pandas  24000.0
5                 
6    Java  22000.0
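When the column names come from a list, the same pattern applies. The sketch below works on a copy so the original frame is left intact; the names cols and out, and the shortened data, are assumptions made for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", np.nan, "Hadoop"],
    "Fee": [20000, 25000, np.nan],
    "Discount": [1000, np.nan, 1500],
})

cols = ["Courses", "Fee"]   # columns to blank out
out = df.copy()             # keep the original frame intact
out[cols] = out[cols].fillna("")

print(out)
print(out.dtypes)           # Fee no longer holds only numbers
```

Columns not named in the list, such as Discount here, keep their NaN values.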

4. Using fillna() to Replace NaN/Null Values With Empty String

Use pandas.DataFrame.fillna() to replace NaN/Null values with an empty string. This replaces each NaN in the Pandas DataFrame with an empty string.


# Using pandas.DataFrame.fillna() to replace NaN values
df2 = df.fillna("")
print("After replacing the NaN values with an empty string:\n", df2)

Yields below output.


# Output:
# After replacing the NaN values with an empty string:
  Courses      Fee Duration Discount
0   Spark  20000.0            1000.0
1          25000.0   40days         
2  Hadoop            35days   1500.0
3  Python  22000.0                  
4  pandas  24000.0   60days   2500.0
5                    50days   2100.0
6    Java  22000.0   55days         
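If the blanks are only needed for display, one alternative (not part of the approach above) is to leave the data as it is and blank out NaN at print time with the na_rep parameter of DataFrame.to_string(). A minimal sketch with a shortened, hypothetical frame:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", np.nan],
    "Fee": [20000, np.nan],
})

# Render NaN as blank in the printed text only
text = df.to_string(na_rep="")
print(text)

# The underlying data is unchanged, so numeric operations still work
print(df["Fee"].mean())
```

This keeps Fee as a numeric column, which filling it with empty strings would not.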

5. fillna() with inplace=True

As the above output shows, fillna() returns a new DataFrame and leaves the original unchanged. To update the current DataFrame in place, use df.fillna('', inplace=True). When used this way, fillna() returns None, so do not assign its result back to a variable.


# fillna() with inplace=True updates df itself and returns None
result = df.fillna('', inplace=True)
print("Return value of fillna() with inplace=True:", result)

Yields below output.


# Output:
# Return value of fillna() with inplace=True: None
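The two ways of updating a DataFrame can be compared side by side. In this sketch, df_a and df_b are hypothetical names for two copies of the same small frame. One is modified in place and the other is reassigned from the returned value.

```python
import numpy as np
import pandas as pd

df_a = pd.DataFrame({
    "Courses": ["Spark", np.nan],
    "Duration": [np.nan, "40days"],
})
df_b = df_a.copy()

# Option 1: modify df_a in place and ignore the return value
df_a.fillna("", inplace=True)

# Option 2: assign the returned DataFrame back to the same name
df_b = df_b.fillna("")

# Both frames should now hold the same values
print(df_a.equals(df_b))
```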

6. Replacing NaN with Empty String on a Specific Column

If you want to fill a single column, you can use df.Courses.fillna(''). This returns a new Series and leaves the DataFrame itself unchanged.


# Replace NaN with an empty string in a single column
df2 = df.Courses.fillna('')
print("After replacing the NaN values with an empty string:\n", df2)

Yields below output.


# Output:
# After replacing the NaN values with an empty string:
0     Spark
1          
2    Hadoop
3    Python
4    pandas
5          
6      Java
Name: Courses, dtype: object
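To store the filled column in the DataFrame, assign the result back to it. The sketch below uses bracket notation for the assignment. The shortened data and the names filled and before are assumptions for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", np.nan, "Hadoop"],
    "Fee": [20000, np.nan, 22000],
})

filled = df["Courses"].fillna("")          # new Series; df is untouched
before = int(df["Courses"].isna().sum())   # NaN count still in df
print(before)

df["Courses"] = filled                     # write the result back
print(df["Courses"].tolist())
```

Only Courses is affected; the NaN in Fee stays as it was.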

7. Replace NaN with Zeros

These examples replace NaN values with zeros in a single column: first in the string column Courses, then in the numeric column Discount.


# Replace NaN with zeros in the Courses column
df['Courses'] = df['Courses'].fillna(0)
print("After replacing the NaN values with zeros:\n", df['Courses'])

# Replace NaN with zeros in the Discount column
df['Discount'] = df['Discount'].fillna(0)
print("After replacing the NaN values with zeros:\n", df['Discount'])

Yields below output for the Courses column.


# Output:
# After replacing the NaN values with zeros:
0     Spark
1         0
2    Hadoop
3    Python
4    pandas
5         0
6      Java
Name: Courses, dtype: object
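For a numeric column such as Discount, filling with zero leaves the column as float64, because it was float64 while it held NaN. If whole numbers are wanted, a cast can follow the fill. This is a sketch on a shortened, hypothetical column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Discount": [1000, np.nan, 1500, np.nan]})

# Fill NaN with 0, then cast from float64 to int64
discount = df["Discount"].fillna(0).astype("int64")
print(discount.tolist())
print(discount.dtype)
```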

8. Remove the NaN and Fill with an Empty String

Use df.Courses.replace(np.nan, '') to replace NaN with an empty string in the Courses column. As with the whole DataFrame, regex=True is not required when replacing np.nan.


# Replace NaN with an empty string using replace()
df2 = df.Courses.replace(np.nan, '')
print("After replacing NaN values with an empty string:\n", df2)

Yields below output.


# Output:
# After replacing NaN values with an empty string:
0     Spark
1          
2    Hadoop
3    Python
4    pandas
5          
6      Java
Name: Courses, dtype: object
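For this particular task, replace(np.nan, '') and fillna('') should give the same result on a column. The sketch below checks that on a short, hypothetical Series. The names via_replace and via_fillna are introduced only for the comparison.

```python
import numpy as np
import pandas as pd

courses = pd.Series(["Spark", np.nan, "Hadoop", np.nan], name="Courses")

via_replace = courses.replace(np.nan, "")
via_fillna = courses.fillna("")

# True when both approaches produce identical Series
print(via_replace.equals(via_fillna))
```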

9. Remove the NaN and Fill with a Specified Value

Use df.Courses.replace(np.nan, 'value') to replace NaN with a specified value. Here the literal string 'value' stands in for whatever replacement you need.


# Replace NaN with a specified value
df2 = df.Courses.replace(np.nan, 'value')
print("After replacing NaN values with specified value:\n", df2)

Yields below output.


# Output:
# After replacing NaN values with specified value:
0     Spark
1     value
2    Hadoop
3    Python
4    pandas
5     value
6      Java
Name: Courses, dtype: object
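A quick way to confirm that a replacement did what was intended is to compare counts before and after. In this sketch, "Unknown" is an arbitrary placeholder chosen for illustration and the Series is a shortened, hypothetical version of the Courses column.

```python
import numpy as np
import pandas as pd

courses = pd.Series(["Spark", np.nan, "Hadoop", np.nan], name="Courses")

missing_before = int(courses.isna().sum())

# "Unknown" is an arbitrary placeholder value for this example
filled = courses.replace(np.nan, "Unknown")

placeholder_count = int((filled == "Unknown").sum())
missing_after = int(filled.isna().sum())

print(missing_before, placeholder_count, missing_after)
```

If the placeholder count matches the earlier NaN count and no NaN values remain, every missing entry was replaced.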

Conclusion

In this article, you have learned how to replace NaN with blank/empty strings in Pandas using the DataFrame.fillna() and DataFrame.replace() methods. You have also learned how to apply the replacement to a single column and to multiple columns.

Happy Learning !!

Naveen (NNK)

Naveen (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across it. Follow Naveen @ LinkedIn and Medium
