Pandas – Replace NaN with Blank/Empty String

You can replace NaN values with a blank/empty string in a Pandas DataFrame by using the replace() or fillna() methods. NaN stands for Not a Number and is one of the common ways to represent a missing value in a Python/Pandas DataFrame. Often you need to replace missing values with something that makes sense for the column, such as zeros for numeric columns and blank or empty strings for string-type columns.

In this Pandas DataFrame article, I will explain how to replace NaN values with blank/empty strings in a single column, in multiple columns (a list of columns), or in all columns, with examples.

1. Quick Examples of Replacing NaN with an Empty/Blank String

If you are in a hurry, below are some quick examples of how to replace NaN with a blank/empty string in a Pandas DataFrame.


# Below are quick examples
# Replace all NaN values with an empty string
df2 = df.replace(np.nan, '', regex=True)
print(df2)

# Replace NaN in multiple columns (the result is also assigned back to df)
df2 = df[['Courses','Fee']] = df[['Courses','Fee']].fillna('')
print(df2)

# Use pandas.DataFrame.fillna() to replace NaN values
df2 = df.fillna("")
print(df2)

# Replace NaN in place; with inplace=True, fillna() returns None
df2 = df.fillna('', inplace=True)
print(df2)

# Replace NaN with an empty string in a single column
df2 = df.Courses.fillna('')
print(df2)

# Replace NaN with zeros in the Courses column (also updates df)
df2 = df['Courses'] = df['Courses'].fillna(0)
print(df2)

# Replace NaN with zeros in the Discount column (also updates df)
df2 = df['Discount'] = df['Discount'].fillna(0)
print(df2)

# Replace NaN with an empty string using replace()
df2 = df.Courses.replace(np.nan, '', regex=True)
print(df2)

# Replace NaN with a specific value using replace()
df2 = df.Courses.replace(np.nan, 'value', regex=True)
print(df2)

Now, let’s create a DataFrame with a few rows and columns and execute some examples and validate the results. Our DataFrame contains column names Courses, Fee, Duration and Discount.


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark",np.nan,"Hadoop","Python","pandas",np.nan,"Java"],
    'Fee' :[20000,25000, np.nan,22000,24000,np.nan,22000],
    'Duration':[np.nan,'40days','35days', np.nan,'60days','50days','55days'],
    'Discount':[1000,np.nan,1500,np.nan,2500,2100,np.nan]
}
df = pd.DataFrame(technologies)
print(df)

This yields the output below.


  Courses      Fee Duration  Discount
0   Spark  20000.0      NaN    1000.0
1     NaN  25000.0   40days       NaN
2  Hadoop      NaN   35days    1500.0
3  Python  22000.0      NaN       NaN
4  pandas  24000.0   60days    2500.0
5     NaN      NaN   50days    2100.0
6    Java  22000.0   55days       NaN
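
Before replacing anything, it can help to confirm where the missing values are. The following is a minimal sketch, assuming the same sample data as above; it uses isna() and sum() to count the NaN values in each column. The variable name nan_counts is only illustrative.

```python
import numpy as np
import pandas as pd

# Same sample data as in the article
technologies = {
    'Courses': ["Spark", np.nan, "Hadoop", "Python", "pandas", np.nan, "Java"],
    'Fee': [20000, 25000, np.nan, 22000, 24000, np.nan, 22000],
    'Duration': [np.nan, '40days', '35days', np.nan, '60days', '50days', '55days'],
    'Discount': [1000, np.nan, 1500, np.nan, 2500, 2100, np.nan],
}
df = pd.DataFrame(technologies)

# isna() marks each missing cell as True; sum() counts them per column
nan_counts = df.isna().sum()
print(nan_counts)
```

A column with a count of zero needs no replacement at all.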

2. Convert NaN to Empty String in Pandas

Use the df.replace(np.nan,'',regex=True) method to replace all NaN values in the Pandas DataFrame with an empty string. The regex=True argument is optional here, because np.nan is matched as a value rather than as a pattern.


# Replace NaN with an empty string in the whole DataFrame
df2 = df.replace(np.nan, '', regex=True)
print(df2)

This yields the output below.


  Courses      Fee Duration Discount
0   Spark  20000.0            1000.0
1          25000.0   40days         
2  Hadoop            35days   1500.0
3  Python  22000.0                  
4  pandas  24000.0   60days   2500.0
5                    50days   2100.0
6    Java  22000.0   55days         
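
As a quick check on the replacement, the sketch below calls replace() without the regex flag on a small illustrative frame (a trimmed-down stand-in for the article's data, not the full DataFrame) and then verifies that no missing values remain. The names blank_df and remaining are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's full data)
df = pd.DataFrame({
    'Courses': ['Spark', np.nan, 'Hadoop'],
    'Fee': [20000, 25000, np.nan],
})

# np.nan is matched as a value, so the regex flag is left out here
blank_df = df.replace(np.nan, '')

# Count the missing values that are left after the replacement
remaining = int(blank_df.isna().sum().sum())
print(blank_df)
print(remaining)
```

Keep in mind that a numeric column such as Fee now holds a mix of numbers and strings, so it is best treated as display-ready data rather than input for further calculations.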

3. Replace NaN with an Empty String in Multiple Columns

To replace NaN values with blank strings in multiple columns, or in all columns from a list, use df[['Courses','Fee']] = df[['Courses','Fee']].fillna(''). This replaces the NaN values in the Courses and Fee columns.


# Replace NaN in multiple columns; the result is assigned back to df and to df2
df2 = df[['Courses','Fee']] = df[['Courses','Fee']].fillna('')
print(df2)

This yields the output below.


  Courses      Fee
0   Spark  20000.0
1          25000.0
2  Hadoop         
3  Python  22000.0
4  pandas  24000.0
5                 
6    Java  22000.0
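
When different columns need different replacements, fillna() also accepts a dictionary that maps column names to fill values. The sketch below is one way to do this, assuming a small illustrative frame; the choice of '' for Courses and 0 for Fee is only an example.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's full data)
df = pd.DataFrame({
    'Courses': ['Spark', np.nan, 'Hadoop'],
    'Fee': [20000, np.nan, 23000],
    'Duration': ['30days', np.nan, '35days'],
})

# Keys are column names, values are the fill value for that column;
# columns that are not listed (Duration here) are left untouched
filled = df.fillna({'Courses': '', 'Fee': 0})
print(filled)
```

This keeps Fee numeric while still blanking out the missing course names.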

4. Using pandas.DataFrame.fillna() to Replace NaN/Null Values With Empty String

Use pandas.DataFrame.fillna("") to replace every NaN/Null value in the DataFrame with an empty string.


# Use pandas.DataFrame.fillna() to replace NaN values
df2 = df.fillna("")
print(df2)

This yields the output below.


  Courses      Fee Duration Discount
0   Spark  20000.0            1000.0
1          25000.0   40days         
2  Hadoop            35days   1500.0
3  Python  22000.0                  
4  pandas  24000.0   60days   2500.0
5                    50days   2100.0
6    Java  22000.0   55days         
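
Calling fillna("") on the whole DataFrame also puts empty strings into numeric columns such as Fee, which typically makes later arithmetic on them awkward. If that is a concern, one possible approach is to fill only the non-numeric columns. The sketch below assumes a small illustrative frame and uses select_dtypes(exclude='number') to pick those columns; text_cols is a name chosen for this example.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's full data)
df = pd.DataFrame({
    'Courses': ['Spark', np.nan, 'Hadoop'],
    'Fee': [20000, 25000, np.nan],
})

# Pick every column that is not numeric
text_cols = df.select_dtypes(exclude='number').columns

# Fill NaN with '' only in those columns; numeric columns keep their NaN
df[text_cols] = df[text_cols].fillna('')
print(df)
print(df.dtypes)
```

The Fee column stays numeric and keeps its NaN, so it can still be summed or averaged.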

5. pandas.DataFrame.fillna() With inplace=True

As you may have noticed above, fillna() returns a new DataFrame. To update the current DataFrame in place instead, use df.fillna('',inplace=True). When called this way, fillna() returns None.


# Replace NaN in place; fillna() returns None here
df2 = df.fillna('', inplace=True)
print(df2)

This yields the output below.


None
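
Because the in-place call gives back nothing useful, its result should not be assigned to a variable. The sketch below compares the two common patterns on a small text-only frame (an assumption made to keep the example simple): modifying the DataFrame in place, and reassigning the returned copy. The names df_inplace and df_reassigned are illustrative.

```python
import numpy as np
import pandas as pd

# Small illustrative data with text columns only
data = {'Courses': ['Spark', np.nan], 'Duration': [np.nan, '40days']}

# Option 1: modify the existing DataFrame; do not assign the result
df_inplace = pd.DataFrame(data)
df_inplace.fillna('', inplace=True)

# Option 2: rebind the name to the DataFrame returned by fillna()
df_reassigned = pd.DataFrame(data)
df_reassigned = df_reassigned.fillna('')

# Both approaches end up with the same content
same = df_inplace.equals(df_reassigned)
print(df_inplace)
print(same)
```

Either pattern works; the reassignment form is often easier to read and to chain with other methods.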

6. Replacing NaN with an Empty String in a Specific Column

If you want to fill a single column, you can use df.Courses.fillna(''). This returns a new Series and leaves df unchanged.


# Replace NaN with an empty string in a single column
df2 = df.Courses.fillna('')
print(df2)

This yields the output below.


0     Spark
1          
2    Hadoop
3    Python
4    pandas
5          
6      Java
Name: Courses, dtype: object
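
To keep the change in the DataFrame itself, the filled Series can be assigned back to the column. The following is a minimal sketch on a small illustrative frame; it uses bracket notation for the assignment.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's full data)
df = pd.DataFrame({
    'Courses': ['Spark', np.nan, 'Hadoop'],
    'Fee': [20000, np.nan, 23000],
})

# fillna() returns a new Series; assigning it back updates the column
df['Courses'] = df['Courses'].fillna('')
print(df)
```

Only the Courses column changes; the NaN in Fee is left as it was.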

7. Replace NaN with Zeros in Pandas

These examples replace NaN values with zeros in a single column of a Pandas DataFrame.


# Replace NaN with zeros in the Courses column
df2 = df['Courses'] = df['Courses'].fillna(0)
print(df2)

# Replace NaN with zeros in the Discount column
df2 = df['Discount'] = df['Discount'].fillna(0)
print(df2)

The first print() yields the output below.


0     Spark
1         0
2    Hadoop
3    Python
4    pandas
5         0
6      Java
Name: Courses, dtype: object
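
For a numeric column such as Discount, filling with zero leaves the values as floats, because the column was stored as float to hold NaN. If whole numbers are wanted, the result can be converted afterwards. This is a minimal sketch on a one-column illustrative frame.

```python
import numpy as np
import pandas as pd

# Small illustrative frame (not the article's full data)
df = pd.DataFrame({'Discount': [1000, np.nan, 1500]})

# Fill NaN with 0 first, then convert the float column to integers
discount = df['Discount'].fillna(0).astype(int)
print(discount)
```

The conversion has to come after fillna(), since a column that still contains NaN cannot be cast to a plain integer type.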

8. Remove the NaN and Fill With an Empty String

Use df.Courses.replace(np.nan,'',regex=True) to replace NaN with an empty string in the Courses column.


# Replace NaN with an empty string using replace()
df2 = df.Courses.replace(np.nan, '', regex=True)
print(df2)

This yields the output below.


0     Spark
1          
2    Hadoop
3    Python
4    pandas
5          
6      Java
Name: Courses, dtype: object

9. Remove the NaN and Fill With Some Value

Use df.Courses.replace(np.nan,'value',regex=True) to replace NaN with a value of your choice, here the string 'value'.


# Replace NaN with a specific value using replace()
df2 = df.Courses.replace(np.nan, 'value', regex=True)
print(df2)

This yields the output below.


0     Spark
1     value
2    Hadoop
3    Python
4    pandas
5     value
6      Java
Name: Courses, dtype: object
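
For a plain NaN-to-value substitution like this one, replace() and fillna() can be used interchangeably. The sketch below shows both on a small illustrative Series; the names via_replace and via_fillna are assumptions made for this example.

```python
import numpy as np
import pandas as pd

# Small illustrative Series standing in for the Courses column
courses = pd.Series(['Spark', np.nan, 'Hadoop'], name='Courses')

# Two ways to substitute a placeholder for NaN
via_replace = courses.replace(np.nan, 'value')
via_fillna = courses.fillna('value')

print(via_replace)
print(via_fillna)
```

fillna() states the intent more directly, while replace() is handy when other values need to be swapped as well.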

Conclusion

In this article, you have learned how to replace NaN with blank/empty strings in Pandas using the DataFrame.fillna() and DataFrame.replace() methods. You have also learned how to do this for a single column and for multiple columns.

Happy Learning !!

NNK
