Pandas – Add an Empty Column to a DataFrame

There are multiple ways to add a new empty/blank column (one or several) to a pandas DataFrame: the assignment operator and the assign(), reindex(), insert() and apply() methods. With these you can add one or multiple empty columns filled with NaN, None, or blank (empty string) values in every cell. pandas treats NaN and None as missing values, whereas a blank string is an ordinary string that merely looks empty, so functions such as isna() do not flag it. Based on your use case, choose one of these.

Related: How to Drop Columns with NaN Values in pandas DataFrame

In this article, I will go over how to add single and multiple empty columns at the beginning, in the middle, at a specified index, or at the end of a DataFrame, filled with NaN, None, or empty values.

Takeaways:

  • Use the assign() method to add multiple empty columns; each new column can take a different empty value.
  • Use the reindex() method to add columns by listing the existing and new column names in the order you want. This is practical when you have only a few columns. By default, the new columns are filled with NaN values.
  • Use the insert() method to add a new column at any position of the DataFrame, for example at the beginning, middle, or end.

1. Quick Examples of Adding an Empty Column

If you are in a hurry, below are some quick examples of how to add an empty column to a pandas DataFrame.


# Below are quick examples

# Add empty column to the DataFrame
df["Blank_Column"] = " "
df["NaN_Column"] = np.nan
df["None_Column"] = None

# Add empty columns using the assign() method
df2 = df.assign(Blank_Column=" ", NaN_Column = np.nan, None_Column=None)

# Add multiple columns with NaN, using the columns param
df2 = df.reindex(columns = df.columns.tolist() + ["None_Column", "None_Column_2"])

# Add multiple columns with NaN, using the axis param
df2 = df.reindex(df.columns.tolist() + ["None_Column", "None_Column_2"],axis=1)

# Add multiple columns to the Beginning
df2 = df.reindex(columns=["None_Column", "None_Column_2"]+df.columns.tolist())

# Add multiple columns with NaN in the middle
df2 = df.reindex(columns=["Courses","None_Column", "None_Column_2","Fee"])

# Using insert(), add empty column at first position
df.insert(0,"Blank_Column", " ")

# Using apply() & lambda function
df["Blank_Column"] = df.apply(lambda _: ' ', axis=1)

Let’s create a pandas DataFrame with a few rows and columns and run some examples to learn how to add an empty column. Our DataFrame contains the columns Courses and Fee.


import pandas as pd
import numpy as np
technologies = ({
    'Courses':["Spark","PySpark"],
    'Fee' :[20000,25000]
               })
df = pd.DataFrame(technologies)
print(df)

Yields below output.


   Courses    Fee
0    Spark  20000
1  PySpark  25000

2. Add an Empty Column to DataFrame Using Assignment Operator

One of the simplest ways to add an empty column to a pandas DataFrame is the assignment operator. The examples below add a blank-string column, an np.nan column, and a None column. All of these columns look empty when the DataFrame is printed.


# Add blank, NaN and None columns to the DataFrame using the assignment operator
df["Blank_Column"] = " "
df["NaN_Column"] = np.nan
df["None_Column"] = None

Yields below output.


   Courses    Fee Blank_Column  NaN_Column None_Column
0    Spark  20000                      NaN        None
1  PySpark  25000                      NaN        None
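
Although the three columns all look empty when printed, pandas does not treat them the same way. The following is a minimal sketch, assuming the same two-row DataFrame as above, that uses isna() to count the missing cells in each new column; the variable name missing_counts is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
df["Blank_Column"] = " "
df["NaN_Column"] = np.nan
df["None_Column"] = None

# Count missing cells per new column: NaN and None count as missing, " " does not
missing_counts = df[["Blank_Column", "NaN_Column", "None_Column"]].isna().sum()
print(missing_counts)
```

If you plan to use isna(), dropna() or fillna() later, NaN or None is usually a better placeholder than a blank string.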

3. Add Multiple Empty Columns using the DataFrame.assign()

Using the DataFrame.assign() method, you can add multiple empty columns to a pandas DataFrame in one call. This method does not modify the existing DataFrame; it returns a new DataFrame with the specified empty columns added.


# Add empty columns using the assign() method
df2 = df.assign(Blank_Column=" ", NaN_Column = np.nan, None_Column=None)
print(df2)

This yields the same output as above.
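
To make the "returns a new DataFrame" behaviour concrete, here is a small sketch under the same assumptions as the example data above. It checks the column lists of both objects after calling assign().

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# assign() builds a new DataFrame; the keyword names become the column names
df2 = df.assign(Blank_Column=" ", NaN_Column=np.nan, None_Column=None)

print(df.columns.tolist())   # original columns only
print(df2.columns.tolist())  # original columns followed by the three new ones
```

Because the original df is left untouched, you need to keep the returned object (here df2) or reassign it to df.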

4. Add an Empty Column Using DataFrame.reindex()

If you want to add several new columns that are all filled with NaN, you can use the DataFrame.reindex() method. This method takes the full list of column labels, which includes both the existing columns and the new columns to be added.

  • Note that you must pass the labels through the columns param or together with axis=1. Labels passed without either of these are applied to the row index, which adds empty rows instead of columns.
  • Using this you can add NaN columns at the beginning, at the end, or in the middle of the DataFrame.

# Add multiple columns with NaN, using the columns param
df2 = df.reindex(columns = df.columns.tolist() + ["None_Column", "None_Column_2"])
print(df2)

# Add multiple columns with NaN, using the axis param
df2 = df.reindex(df.columns.tolist() + ["None_Column", "None_Column_2"], axis=1)
print(df2)

Yields below output.


   Courses    Fee  None_Column  None_Column_2
0    Spark  20000          NaN            NaN
1  PySpark  25000          NaN            NaN

To add empty columns in the middle, use df.reindex(columns=["Courses","None_Column", "None_Column_2","Fee"]).

This method also works for adding new rows: pass the row labels with .reindex(index=[…]).
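
The sketch below illustrates both uses of reindex() on the example data: placing new columns in the middle, and adding a row through the index parameter. The row label 2 and the variable names middle and more_rows are assumptions made for illustration only.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# New column labels placed between the existing ones; they are filled with NaN
middle = df.reindex(columns=["Courses", "None_Column", "None_Column_2", "Fee"])
print(middle)

# Reindexing the row labels instead: label 2 does not exist yet, so a NaN row is added
more_rows = df.reindex(index=[0, 1, 2])
print(more_rows)
```

In both cases reindex() returns a new DataFrame and leaves df unchanged.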

5. Add an Empty Column by Index Using DataFrame.insert()

When you have hundreds of columns, the above methods make it hard to add an empty column at a specific index (any position). Use the DataFrame.insert() method to add an empty column at any position in the pandas DataFrame.

This adds the column in place on the existing DataFrame object, so nothing is returned.


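# Start from the original DataFrame so that Blank_Column does not already exist
df = pd.DataFrame(technologies)

# Using insert(), add empty column at first position
df.insert(0,"Blank_Column", " ")
print(df)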

Yields below output.


  Blank_Column  Courses    Fee
0                 Spark  20000
1               PySpark  25000
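
The position does not have to be hard-coded. As a rough sketch, assuming you want the new column directly before an existing one, you can look up that column's position with df.columns.get_loc(); the column names NaN_Column and Last_Column are placeholders chosen for this example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Insert a NaN column immediately before the existing "Fee" column
df.insert(df.columns.get_loc("Fee"), "NaN_Column", np.nan)

# Insert a blank column at the end by using the current number of columns as the position
df.insert(len(df.columns), "Last_Column", " ")

# insert() refuses to add a name that already exists (unless allow_duplicates=True)
duplicate_error = None
try:
    df.insert(0, "Fee", np.nan)
except ValueError as err:
    duplicate_error = str(err)

print(df.columns.tolist())
print(duplicate_error)
```

Because insert() raises a ValueError for an existing column name, re-running the same insert() call on the same DataFrame fails unless you recreate the DataFrame first.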

6. Using DataFrame.apply() and Lambda Function

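Finally, you can also add an empty column using DataFrame.apply() with a lambda function. With axis=1 the lambda is called once per row, and its return value becomes that row's cell in the new column.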


# Using apply() & lambda function
df = pd.DataFrame(technologies)
df["Blank_Column"] = df.apply(lambda _: ' ', axis=1)
print(df)

Yields below output.


   Courses    Fee Blank_Column
0    Spark  20000             
1  PySpark  25000  
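
For a constant value, apply() should produce the same column as a plain scalar assignment. The sketch below, which assumes the same example data, compares the two; via_apply is just an illustrative variable name.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# apply() with axis=1 calls the lambda once per row and collects the results
via_apply = df.apply(lambda _: " ", axis=1)

# Assigning a scalar broadcasts the same value to every row
df["Blank_Column"] = " "

print(via_apply.tolist() == df["Blank_Column"].tolist())
```

The apply() form is mainly useful when the value depends on the row; for a fixed blank value the scalar assignment is shorter.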

7. Complete Example To Add an Empty Column

Complete working example for reference.


import pandas as pd
import numpy as np
technologies = ({
    'Courses':["Spark","PySpark"],
    'Fee' :[20000,25000]
               })
df = pd.DataFrame(technologies)
print(df)

# Add empty column
df["Blank_Column"] = " "
df["NaN_Column"] = np.nan
df["None_Column"] = None
print(df)

# Add empty columns using the assign() method
df = pd.DataFrame(technologies)
df2 = df.assign(Blank_Column=" ", NaN_Column = np.nan, None_Column=None)
print(df2)

# Add multiple columns with NaN, using the columns param
df2 = df.reindex(columns = df.columns.tolist() + ["None_Column", "None_Column_2"])
print(df2)

# Add multiple columns with NaN, using the axis param
df2 = df.reindex(df.columns.tolist() + ["None_Column", "None_Column_2"],axis=1)
print(df2)

# Add multiple columns to the Beginning
df2 = df.reindex(columns=["None_Column", "None_Column_2"]+df.columns.tolist())
print(df2)

# Add multiple columns with NaN in the middle
df2 = df.reindex(columns=["Courses","None_Column", "None_Column_2","Fee"])
print(df2)

# Using insert(), add empty column at first position
df.insert(0,"Blank_Column", " ")
print(df)

# Using apply() & lambda function
df = pd.DataFrame(technologies)
df["Blank_Column"] = df.apply(lambda _: ' ', axis=1)
print(df)

Conclusion

In this article, you have learned how to add one or multiple empty columns to a pandas DataFrame using the assignment operator and the assign(), reindex(), insert() and apply() methods. Using these you can add a blank column at the beginning, in the middle, or at the end of the DataFrame. Choose the approach that fits your need: the assignment operator and insert() modify the existing DataFrame, while assign() and reindex() return a new one.

Happy Learning !!

NNK
