In pandas, you can add a column with a default value to an existing DataFrame by using df[], assign(), or insert(). DataFrame.assign() returns a new DataFrame with the column added and leaves the original unchanged, while DataFrame.insert() adds the column to the existing DataFrame in place, at a position you choose. In this article, I will explain how to add a column with a default value to a pandas DataFrame, with examples.
Key Points –
- Specify the default value directly within the DataFrame constructor or when using methods like DataFrame.insert() or DataFrame.assign().
- Ensure the default value aligns with the data type of the column.
- Use the assignment operator (=) to create a new column and assign the default value.
- Utilize the .loc indexer to assign values to the new column based on conditions or criteria.
- For large datasets, prefer assigning a single scalar, which pandas broadcasts to every row, over building a list of repeated values.
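To illustrate the point about data types, here is a minimal sketch showing how the type of the default value determines the dtype of the new column. The column names Rating, Active, and Tutor and the two-row DataFrame are hypothetical and chosen only for this illustration.

```python
import numpy as np
import pandas as pd

# Small hypothetical DataFrame used only for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# A scalar default is broadcast to every row; its type drives the column dtype
df["Discount"] = 0      # integer default
df["Rating"] = 0.0      # float default
df["Active"] = True     # boolean default
df["Tutor"] = np.nan    # missing-value default, stored as float

print(df.dtypes)
```

Choosing a default of the right type up front avoids a later conversion step, for example when a numeric column would otherwise start out holding strings.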
1. Quick Examples of Add Column with Default Value
If you are in a hurry, below are some quick examples of adding a column with a default value on DataFrame.
# Quick examples of adding a column with a default value
# Each example is independent and assumes the df created below
# Example 1: Use DataFrame.assign() function
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'])
# Example 2: Add a new column from a list variable
tutors = ['William', 'Henry', 'Michael', 'John']
df2 = df.assign(Tutors=tutors)
# Example 3: Add new column with default value
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
# Example 4: Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
# Example 5: Add new column with default value
# using df[] operator
df['Percent'] = 'NAN'
# Example 6: Add column with default value
# using DataFrame.insert() function
# (raises ValueError if a "Percent" column already exists)
df.insert(4, "Percent", "10%", allow_duplicates=False)
Now, let's create a pandas DataFrame using data from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
Yields below output.
# Output:
    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   pandas  30000   50days      2000
2. Add Column with Default Value Using DataFrame.assign()
The DataFrame.assign() function adds a column with a default value to a pandas DataFrame and returns a new DataFrame that includes the added column.
Below is the syntax of the assign() function.
# Syntax of DataFrame.assign()
DataFrame.assign(**kwargs)
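Because assign() takes keyword arguments, several default-valued columns can be added in a single call, with each keyword becoming a column name. The following is a small sketch under that assumption; the Active column and the two-row DataFrame are hypothetical.

```python
import pandas as pd

# Small hypothetical DataFrame used only for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Each keyword argument becomes a new column; scalars fill every row
df2 = df.assign(Tutors="NAN", Active=True)

print(list(df2.columns))  # ['Courses', 'Fee', 'Tutors', 'Active']
```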
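Let's add a column "Tutors" to the DataFrame with the placeholder string 'NAN' as its default value. Note that 'NAN' is an ordinary string, not a missing value; use np.nan if you need a true NaN. assign() does not modify the existing DataFrame in place; instead, it returns a new DataFrame with the column added.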
# Add new column with default value
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
print(df2)
Yields below output.
# Output:
    Courses    Fee Duration  Discount Tutors
r1    Spark  20000   30days      1000    NAN
r2  PySpark  25000   40days      2300    NAN
r3   Python  22000   35days      1200    NAN
r4   pandas  30000   50days      2000    NAN
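The difference between the string 'NAN' and a real missing value matters as soon as you call functions such as isna(). The sketch below contrasts the two and also confirms that assign() leaves the original DataFrame untouched. The two-row DataFrame and the variable names placeholder and missing are made up for this illustration.

```python
import numpy as np
import pandas as pd

# Small hypothetical DataFrame used only for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

placeholder = df.assign(Tutors="NAN")  # the literal string 'NAN'
missing = df.assign(Tutors=np.nan)     # a real missing value

print(placeholder["Tutors"].isna().sum())  # 0, the string is not treated as missing
print(missing["Tutors"].isna().sum())      # 2, both rows are missing
print("Tutors" in df.columns)              # False, df itself was not modified
```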
3. Add New Column with Default Value Using df[ ] Operator
Using the df[] operator, you can add a column with a default value to a pandas DataFrame. This is the simplest approach when you want to add a new column to the existing DataFrame in place.
Below is the syntax of the df[] operator.
# Syntax of df[] operator
df[col_name]=value
Let's pass a list of values to the df[] operator to add a "Percent" column with one value per row. The list must have the same length as the DataFrame.
# Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
print(df)
Yields below output.
# Output:
    Courses    Fee Duration  Discount Percent
r1    Spark  20000   30days      1000      5%
r2  PySpark  25000   40days      2300     10%
r3   Python  22000   35days      1200     15%
r4   pandas  30000   50days      2000     20%
Similarly, you can use the df[] operator to add a column with the same value in all rows of the existing DataFrame. Assigning to a column that already exists, as "Percent" does here, overwrites its values.
# Add new column with default value
# Using df[ ] operator
df['Percent'] = 'NAN'
print(df)
Yields below output.
# Output:
    Courses    Fee Duration  Discount Percent
r1    Spark  20000   30days      1000     NAN
r2  PySpark  25000   40days      2300     NAN
r3   Python  22000   35days      1200     NAN
r4   pandas  30000   50days      2000     NAN
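The two forms of df[] assignment behave differently when sizes do not line up: a scalar is broadcast to every row, while a list must supply exactly one value per row. Here is a minimal sketch of that behavior; the Level column and its values are hypothetical.

```python
import pandas as pd

# Small hypothetical DataFrame used only for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

# A scalar is broadcast to all four rows
df["Percent"] = "10%"

# A list with the wrong number of values is rejected
error = None
try:
    df["Level"] = ["basic", "advanced"]  # two values for four rows
except ValueError as exc:
    error = exc

print(error)
```

Catching the ValueError here is only for demonstration; in practice the fix is to supply a list of the right length or a single scalar.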
4. Add Column with Default Value Using DataFrame.insert()
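Using the DataFrame.insert() function, you can insert a column with a default value into a pandas DataFrame at any position. Its first argument is the integer position at which to add the column, and it modifies the DataFrame in place. With allow_duplicates=False, insert() raises a ValueError if a column with that name already exists, so the "Percent" column added in the previous section is dropped first.
# Add column with default value
# using DataFrame.insert() function
df = df.drop(columns="Percent")  # remove the column added earlier
df.insert(4, "Percent", "10%", allow_duplicates=False)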
print(df)
Yields below output.
# Output:
    Courses    Fee Duration  Discount Percent
r1    Spark  20000   30days      1000     10%
r2  PySpark  25000   40days      2300     10%
r3   Python  22000   35days      1200     10%
r4   pandas  30000   50days      2000     10%
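The position argument and the duplicate-name check of insert() can be explored with a short sketch like the following. The two-row DataFrame and the Status column are hypothetical and chosen only for this illustration.

```python
import pandas as pd

# Small hypothetical DataFrame used only for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Position 0 makes the new column the first one; df is modified in place
df.insert(0, "Percent", "10%")

# Inserting the same name again is rejected when duplicates are not allowed
duplicate_error = None
try:
    df.insert(1, "Percent", "5%", allow_duplicates=False)
except ValueError as exc:
    duplicate_error = exc

# A position equal to the number of columns appends at the end
df.insert(len(df.columns), "Status", "new")

print(list(df.columns))  # ['Percent', 'Courses', 'Fee', 'Status']
```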
5. Complete Example For Add Column with Default Value
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# use DataFrame.assign() function
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'])
print(df2)
# Add new column to the DataFrame
tutors = ['William', 'Henry', 'Michael', 'John']
df2 = df.assign(Tutors=tutors)
print(df2)
# Add new column with default value
# using DataFrame.assign() function
df2 = df.assign(Tutors='NAN')
print(df2)
# Use df[] operator
df['Percent'] = ['5%','10%','15%','20%']
print(df)
# Add new column with default value
# Using df[ ] operator
df['Percent'] = 'NAN'
print(df)
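# Add column with default value
# using DataFrame.insert() function
# (drop the existing Percent column first; insert() raises ValueError on a duplicate name)
df = df.drop(columns='Percent')
df.insert(4, "Percent", "10%", allow_duplicates=False)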
print(df)
Frequently Asked Questions on Add Column with Default Value
Adding a default value allows you to initialize new columns with a predefined value, ensuring consistency and facilitating further data manipulation and analysis.
Add a column with a default value by directly assigning the default value to the new column, either during DataFrame creation or by using methods like DataFrame.insert() or DataFrame.assign().
You can use conditional statements with methods like .loc[] or .apply() to set different default values based on specific conditions or criteria for each row.
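As a minimal sketch of the conditional approach, the example below first assigns one default to every row and then uses .loc[] to override it where a condition holds. The Tier column, its values, and the 21000 threshold are assumptions made up for this illustration.

```python
import pandas as pd

# Small hypothetical DataFrame used only for this illustration
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python"],
    "Fee": [20000, 25000, 22000],
})

# Start with one default for every row
df["Tier"] = "standard"

# Override the default only where the condition is true
df.loc[df["Fee"] > 21000, "Tier"] = "premium"

print(df["Tier"].tolist())  # ['standard', 'premium', 'premium']
```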
You can modify the default value of a column by reassigning values to the column using standard DataFrame assignment operations or by applying functions to update the values based on specific conditions.
Default values should be compatible with the data type of the column. Common data types include integers, floats, strings, booleans, and datetime objects, among others. Ensure consistency between the default value and the expected data type of the column.
Conclusion
In this article, I have explained how to add a column with a default value to an existing pandas DataFrame by using df[], DataFrame.assign(), and DataFrame.insert(). You also learned that insert() is used to insert a column with a default value at any position of the DataFrame.
Happy Learning !!