Pandas Add Column with Default Value

In pandas, you can add a column with a default value to an existing DataFrame by using the df[] operator or the assign() and insert() methods. DataFrame.assign() returns a new DataFrame with the column added and leaves the original unchanged. DataFrame.insert() adds the column to the existing DataFrame in place, at a position you choose. In this article, I will explain how to add a column with a default value to a pandas DataFrame with examples.

1. Quick Examples of Adding a Column with Default Value

If you are in a hurry, below are some quick examples of adding a column with a default value to a DataFrame.


# Quick examples of adding a column with a default value

# Example 1: Use DataFrame.assign() function
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'])

# Example 2: Add new column to the DataFrame
tutors = ['William', 'Henry', 'Michael', 'John']
df2 = df.assign(Tutors=tutors)

# Example 3: Add new column with default value
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')

# Example 4: Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']

# Example 5: Add new column with default value
# using df[] operator
df['Percent'] = 'NAN'

# Example 6: Add column with default value
# using DataFrame.insert() function
# (the Percent column must not already exist)
df.insert(4, "Percent", "10%", allow_duplicates=False)

Now, let's create a pandas DataFrame using data from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

This yields the output below.


    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   pandas  30000   50days      2000

2. Add Column with Default Value Using DataFrame.assign()

The DataFrame.assign() function adds a column with a default value to a pandas DataFrame. It returns a new DataFrame that contains the existing columns plus the new one.

Below is the syntax of the assign() function.


# Syntax of DataFrame.assign()
DataFrame.assign(**kwargs)

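Let's add a column "Tutors" to the DataFrame with the default value 'NAN'. Note that 'NAN' here is an ordinary string used as a placeholder, not a missing value; for a real missing value, pass np.nan instead. assign() does not modify the existing DataFrame in place; instead, it returns a new DataFrame with the column added.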


# Add new column with default value 
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
print(df2)

This yields the output below.


    Courses    Fee Duration  Discount Tutors
r1    Spark  20000   30days      1000    NAN
r2  PySpark  25000   40days      2300    NAN
r3   Python  22000   35days      1200    NAN
r4   pandas  30000   50days      2000    NAN
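
To make the two points above concrete, here is a minimal sketch showing that assign() leaves the original DataFrame untouched and that the string 'NAN' is not treated as a missing value. It uses a trimmed-down copy of the sample data; the variable names df_text and df_missing are only illustrative.

```python
import numpy as np
import pandas as pd

# Trimmed-down version of the sample data used in this article
df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    },
    index=["r1", "r2", "r3", "r4"],
)

# assign() returns a new DataFrame; df itself is left untouched
df_text = df.assign(Tutors="NAN")      # the string 'NAN' in every row
df_missing = df.assign(Tutors=np.nan)  # a real missing value in every row

print("Tutors" in df.columns)             # False: the original has no Tutors column
print(df_text["Tutors"].isna().sum())     # 0: 'NAN' is ordinary text
print(df_missing["Tutors"].isna().sum())  # 4: np.nan counts as missing
```

If you want functions such as isna(), fillna(), or dropna() to recognize the default as missing data, np.nan is usually the better choice.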

3. Add New Column with Default Value Using df[ ] Operator

Using the df[] operator, you can add a column with a default value to a pandas DataFrame. This is the simplest approach when you want to add a new column to the existing DataFrame in place.

Below is the syntax of the df[] operator.


# Syntax of df[] operator
df[col_name]=value

Let's create a "Percent" column from a list and assign it with the df[] operator. The list must contain one value per row; each value is assigned to the corresponding row of the DataFrame.


# Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
print(df)

This yields the output below.


    Courses    Fee Duration  Discount Percent
r1    Spark  20000   30days      1000      5%
r2  PySpark  25000   40days      2300     10%
r3   Python  22000   35days      1200     15%
r4   pandas  30000   50days      2000     20%

Similarly, you can use the df[] operator to add a column with the same value in all rows of the existing DataFrame. If the column already exists, as Percent does here, its values are overwritten.


# Add new column with default value 
# Using df[ ] operator
df['Percent'] = 'NAN'
print(df)

This yields the output below.


    Courses    Fee Duration  Discount Percent
r1    Spark  20000   30days      1000     NAN
r2  PySpark  25000   40days      2300     NAN
r3   Python  22000   35days      1200     NAN
r4   pandas  30000   50days      2000     NAN
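
The difference between assigning a single value and assigning a list can be sketched as follows. This is a small illustration on a one-column DataFrame; the "Level" column and the length_error variable are hypothetical names used only for this example.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

# A single (scalar) value is repeated for every row
df["Percent"] = "10%"

# A list must supply exactly one value per row
try:
    df["Level"] = ["basic", "advanced"]  # only 2 values for 4 rows
    length_error = None
except ValueError as exc:
    length_error = exc

print(df["Percent"].tolist())
print(type(length_error).__name__)
```

So for a default value, pass the scalar itself rather than building a list of repeated values by hand.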

4. Add Column with Default Value Using DataFrame.insert()

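Using the DataFrame.insert() function, you can insert a column with a default value into a pandas DataFrame at any position. The first argument is the zero-based index at which the new column is placed. insert() modifies the DataFrame in place and, with allow_duplicates=False, raises an error if a column with that name already exists.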


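# Add column with default value
# using DataFrame.insert() function
# (drop the Percent column from the previous section, if present)
df = df.drop(columns="Percent", errors="ignore")
df.insert(4, "Percent", "10%", allow_duplicates=False)
print(df)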

This yields the output below.


    Courses    Fee Duration  Discount Percent
r1    Spark  20000   30days      1000     10%
r2  PySpark  25000   40days      2300     10%
r3   Python  22000   35days      1200     10%
r4   pandas  30000   50days      2000     10%
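
As a rough sketch of how the position argument and the duplicate check behave, the example below inserts a column in the middle of a small DataFrame and then tries to insert the same name again. The two-row data and the insert_error variable are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark"],
        "Fee": [20000, 25000],
    }
)

# loc=1 places the new column between Courses and Fee
df.insert(1, "Percent", "10%")
print(df.columns.tolist())  # ['Courses', 'Percent', 'Fee']

# Inserting the same column name again is rejected by default
try:
    df.insert(0, "Percent", "20%")
    insert_error = None
except ValueError as exc:
    insert_error = exc

print(type(insert_error).__name__)  # ValueError
```

Because insert() changes the DataFrame in place and returns None, do not assign its result back to df.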

5. Complete Example of Adding a Column with Default Value


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# use DataFrame.assign() function
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'])
print(df2)

# Add new column to the DataFrame
tutors = ['William', 'Henry', 'Michael', 'John']
df2 = df.assign(Tutors=tutors)
print(df2)

# Add new column with default value 
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
print(df2)

# Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
print(df)

# Add new column with default value 
# Using df[ ] operator
df['Percent'] = 'NAN'
print(df)

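# Add column with default value
# using DataFrame.insert() function
# (drop the existing Percent column first; insert() rejects duplicate names)
df = df.drop(columns="Percent")
df.insert(4, "Percent", "10%", allow_duplicates=False)
print(df)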

6. Conclusion

In this article, I have explained how to add a column with a default value to an existing pandas DataFrame by using df[], DataFrame.assign(), and DataFrame.insert(). You have also learned that insert() can place the new column at any position in the DataFrame.

Happy Learning !!
