In pandas, you can add a column with a default value to an existing DataFrame by using the df[] operator or the assign() and insert() functions. DataFrame.assign() returns a new DataFrame with the column added and leaves the original unchanged. DataFrame.insert() adds the column to the existing DataFrame in place, at a position you choose. In this article, I will explain how to add a column with a default value to a pandas DataFrame, with examples.
1. Quick Examples of Adding a Column with a Default Value
If you are in a hurry, below are some quick examples of adding a column with a default value on DataFrame.
# Below are quick examples
# Example 1: Use DataFrame.assign() function
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'])
# Example 2: Add new column to the DataFrame from a list variable
tutors = ['William', 'Henry', 'Michael', 'John']
df2 = df.assign(Tutors=tutors)
# Example 3: Add new column with default value
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
# Example 4: Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
# Example 5: Add new column with default value
# using df[] operator
df['Percent'] = 'NAN'
# Example 6: Add column with default value
# using DataFrame.insert() function
df.insert(4, "Percent", "10%", allow_duplicates=False)
Now, let's create a pandas DataFrame from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r2 PySpark 25000 40days 2300
r3 Python 22000 35days 1200
r4 pandas 30000 50days 2000
2. Add Column with Default Value Using DataFrame.assign()
The DataFrame.assign() function adds a column with a default value to a pandas DataFrame. It returns a new DataFrame that contains the added column; the original DataFrame is not modified.
Below is the syntax of the assign() function.
# Syntax of DataFrame.assign()
DataFrame.assign(**kwargs)
Let's add a column "Tutors" to the DataFrame with the default value 'NAN'. Note that this is the literal string 'NAN', not a missing value (NaN). assign() cannot modify the existing DataFrame in place; instead, it returns a new DataFrame with the column added.
# Add new column with default value
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount Tutors
r1 Spark 20000 30days 1000 NAN
r2 PySpark 25000 40days 2300 NAN
r3 Python 22000 35days 1200 NAN
r4 pandas 30000 50days 2000 NAN
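To make the "returns a new DataFrame" behavior concrete, here is a minimal sketch on a smaller, two-row DataFrame made up for illustration. It also passes a second keyword argument to assign(), which adds a second column in the same call.

```python
import pandas as pd

# Small illustrative DataFrame (not the article's full data)
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]},
    index=["r1", "r2"],
)

# assign() returns a new DataFrame; df itself is left unchanged.
# Each keyword argument becomes one new column.
df2 = df.assign(Tutors="NAN", Discount=0)

print(list(df.columns))   # ['Courses', 'Fee']
print(list(df2.columns))  # ['Courses', 'Fee', 'Tutors', 'Discount']
```

If you want the change to apply to the original variable, reassign the result, for example df = df.assign(Tutors='NAN').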
3. Add New Column with Default Value Using df[ ] Operator
Using the df[] operator, you can add a column with a default value to a pandas DataFrame. This is the simplest approach when you want to add a new column to the existing DataFrame in place.
Below is the syntax of the df[] operator.
# Syntax of df[] operator
df[col_name]=value
Let's add a column "Percent" by assigning a list with the df[] operator. Each list element becomes the value for the corresponding row, so the list length must match the number of rows in the DataFrame.
# Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount Percent
r1 Spark 20000 30days 1000 5%
r2 PySpark 25000 40days 2300 10%
r3 Python 22000 35days 1200 15%
r4 pandas 30000 50days 2000 20%
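Similarly, you can use the df[] operator to add a column with the same value in all rows of the existing DataFrame. Because the Percent column already exists at this point, the assignment overwrites its values.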
# Add new column with default value
# Using df[ ] operator
df['Percent'] = 'NAN'
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount Percent
r1 Spark 20000 30days 1000 NAN
r2 PySpark 25000 40days 2300 NAN
r3 Python 22000 35days 1200 NAN
r4 pandas 30000 50days 2000 NAN
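The examples above use the string 'NAN' as the default. If you want a real missing value instead, one option is to assign np.nan. The sketch below uses a small made-up DataFrame and hypothetical column names (Percent_text, Percent_missing) to show the difference.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame (not the article's full data)
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

df["Percent_text"] = "NAN"      # the literal string 'NAN' in every row
df["Percent_missing"] = np.nan  # a real missing value (float NaN) in every row

# isna() only recognizes real missing values, not the string 'NAN'
print(df["Percent_text"].isna().sum())     # 0
print(df["Percent_missing"].isna().sum())  # 2
```

Functions such as isna(), fillna(), and dropna() treat only the second column as missing.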
4. Add Column with Default Value Using DataFrame.insert()
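With the DataFrame.insert() function, you can insert a column with a default value at any position in a pandas DataFrame. The first argument is the zero-based column position where the new column is placed. insert() modifies the DataFrame in place, and it raises a ValueError if a column with the same name already exists, unless allow_duplicates=True.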
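# Add column with default value
# using DataFrame.insert() function
# Drop the 'Percent' column added earlier; insert() raises
# ValueError if the name exists and allow_duplicates=False
df = df.drop(columns='Percent')
df.insert(4, "Percent", "10%", allow_duplicates=False)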
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount Percent
r1 Spark 20000 30days 1000 10%
r2 PySpark 25000 40days 2300 10%
r3 Python 22000 35days 1200 10%
r4 pandas 30000 50days 2000 10%
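As a short sketch of how the position argument and the duplicate check behave, the example below uses a small made-up DataFrame, inserts a column in the middle, and then tries to insert the same name again.

```python
import pandas as pd

# Small illustrative DataFrame (not the article's full data)
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Insert at position 1, between Courses and Fee; df is modified in place.
df.insert(1, "Percent", "10%")
print(list(df.columns))  # ['Courses', 'Percent', 'Fee']

# Inserting the same name again fails while duplicates are not allowed.
try:
    df.insert(0, "Percent", "5%")
except ValueError as err:
    print("insert failed:", err)
```

The position must be between 0 and the current number of columns, so passing len(df.columns) appends the column at the end.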
5. Complete Example of Adding a Column with a Default Value
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# use DataFrame.assign() function
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'])
print(df2)
# Add new column to the DataFrame
tutors = ['William', 'Henry', 'Michael', 'John']
df2 = df.assign(Tutors=tutors)
print(df2)
# Add new column with default value
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
print(df2)
# Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
print(df)
# Add new column with default value
# Using df[ ] operator
df['Percent'] = 'NAN'
print(df)
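# Add column with default value
# using DataFrame.insert() function
# Drop the existing 'Percent' column first; insert() raises
# ValueError if the name exists and allow_duplicates=False
df = df.drop(columns='Percent')
df.insert(4, "Percent", "10%", allow_duplicates=False)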
print(df)
6. Conclusion
In this article, I have explained how to add a column with a default value to an existing pandas DataFrame by using the df[] operator, DataFrame.assign(), and DataFrame.insert(). You have also learned that insert() is used to insert a column with a default value at any position in the DataFrame.
Happy Learning !!