In pandas, you can add multiple columns to an existing DataFrame using the assign() function, which returns a new DataFrame containing the new columns and leaves the original unchanged. Alternatively, you can call DataFrame.insert() once per column; this function modifies the DataFrame in place and returns None. In this article, I will explain several ways to add multiple columns to a pandas DataFrame.
Key Points –
- Pandas provides methods such as assign() and insert() for adding multiple columns to a DataFrame.
- assign() returns a new DataFrame with the added columns, leaving the original DataFrame unchanged.
- insert() adds one column per call at a specified position, modifying the DataFrame in place and returning None.
- Pass the new columns to assign() as keyword arguments, with column names as keys and the column values as values; an existing dictionary can be unpacked with **.
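The difference between the two return behaviors is easy to check directly. The snippet below is a minimal sketch that uses a small two-row DataFrame made up for illustration; it is not the DataFrame used in the rest of the article.

```python
import pandas as pd

# Small illustrative DataFrame (not the one used later in the article)
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# assign() returns a new DataFrame; df itself keeps its two columns
df2 = df.assign(Tutors=["William", "Henry"], Percent=["5%", "2%"])
print(list(df.columns))   # ['Courses', 'Fee']
print(list(df2.columns))  # ['Courses', 'Fee', 'Tutors', 'Percent']

# insert() changes df in place and returns None
result = df.insert(0, "Tutors", ["William", "Henry"])
print(result)             # None
print(list(df.columns))   # ['Tutors', 'Courses', 'Fee']
```

Because insert() returns None, writing df = df.insert(...) would replace the DataFrame with None, so the result of insert() should not be assigned back.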
Quick Examples of Adding Multiple Columns
The following are quick examples of adding multiple columns to a DataFrame.
# Quick examples of adding multiple columns
# Example 1: Add multiple columns to a DataFrame using df[]
Tutors = ['William', 'Henry', 'Michael', 'John']
Percent = ['5%','2%','4%','3%']
df['Tutors'] = Tutors
df['Percent'] = Percent
# Example 2: Add multiple columns using DataFrame.assign()
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'],
                Percent = ['5%','2%','4%','3%'])
# Example 3: Add multiple columns using DataFrame.insert()
# allow_duplicates=True lets the call succeed if the column name already exists
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
df.insert(4, "Percent", ['5%','2%','4%','3%'], allow_duplicates=True)
To run the examples in this article, let’s first create a pandas DataFrame.
# Create DataFrame
import pandas as pd
technologies = {
    'Courses': ["Spark","PySpark","Hadoop","Pandas"],
    'Fee': [22000,25000,30000,35000],
    'Duration': ['30days','50days','40days','35days'],
    'Discount': [1000,2000,2500,1500]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
Yields below output.
Add Multiple Columns to a DataFrame Using df[]
Using df[] notation, we can add multiple columns to a pandas DataFrame, one assignment per column. Each new column is appended to the end of the existing DataFrame, which is modified in place. For example, to add the "Tutors" and "Percent" columns, we assign a list of values to df['Tutors'] and df['Percent']; each list must have the same length as the DataFrame.
# Add multiple columns to a dataframe
# Using df[]
Tutors = ['William', 'Henry', 'Michael', 'John']
Percent = ['5%','2%','4%','3%']
df['Tutors'] = Tutors
df['Percent'] = Percent
print("DataFrame with added columns:\n", df)
Yields below output.
Add Multiple Columns Using DataFrame.assign()
DataFrame.assign() can also add multiple columns to a pandas DataFrame in a single call. Now let’s add the "Tutors" and "Percent" columns to the DataFrame. assign() cannot modify the existing DataFrame in place; instead, it returns a new DataFrame that contains the original columns plus the new ones. If a column name passed to assign() already exists, its values are overwritten in the returned DataFrame.
# Add multiple columns using DataFrame.assign()
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'],
                Percent = ['5%','2%','4%','3%'])
print(df2)
Yields the same output as above.
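assign() also accepts an unpacked dictionary and callables, which can be handy when the new columns are built programmatically or derived from existing ones. The following is a minimal sketch on a small made-up DataFrame; the column name FinalFee and the variable new_cols are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Small illustrative DataFrame (not the one used elsewhere in the article)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [22000, 25000],
    "Discount": [1000, 2000],
})

# Unpack a dictionary of column name -> values into keyword arguments
new_cols = {"Tutors": ["William", "Henry"], "Percent": ["5%", "2%"]}
df2 = df.assign(**new_cols)
print(list(df2.columns))  # ['Courses', 'Fee', 'Discount', 'Tutors', 'Percent']

# A callable receives the DataFrame and returns the new column's values
# (FinalFee is a hypothetical column name)
df3 = df.assign(FinalFee=lambda d: d["Fee"] - d["Discount"])
print(df3["FinalFee"].tolist())  # [21000, 23000]
```

Dictionary unpacking only works when every column name is a valid Python keyword argument name, that is, a string.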
Add Multiple Columns Using insert()
The DataFrame.insert() function is another way to add columns to a pandas DataFrame, this time at any specified position. Each call inserts a single column at the index you provide, so adding multiple columns takes one call per column. The example below adds columns at the first position (index 0) and the fifth position (index 4).
Note that in pandas, column positions start from zero. The insert() function updates the existing DataFrame object in place and returns None. The allow_duplicates argument (passed as True in the example) lets the call succeed even if a column with the same name already exists. The output shown below assumes df is the original four-column DataFrame created earlier.
# Add multiple columns using DataFrame.insert()
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
df.insert(4, "Percent", ['5%','2%','4%','3%'], allow_duplicates=True)
print(df)
Yields below output.
# Output:
    Tutors  Courses    Fee Duration Percent  Discount
0  William    Spark  22000   30days      5%      1000
1    Henry  PySpark  25000   50days      2%      2000
2  Michael   Hadoop  30000   40days      4%      2500
3     John   Pandas  35000   35days      3%      1500
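Since insert() handles one column per call, several columns can be added by looping over a list of positions, names, and values. This is a minimal sketch rather than a built-in pandas feature; the new_columns list is a hypothetical structure introduced only for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Pandas"],
    "Fee": [22000, 25000, 30000, 35000],
    "Duration": ["30days", "50days", "40days", "35days"],
    "Discount": [1000, 2000, 2500, 1500],
})

# Hypothetical list of (position, column name, values) tuples
new_columns = [
    (0, "Tutors", ["William", "Henry", "Michael", "John"]),
    (4, "Percent", ["5%", "2%", "4%", "3%"]),
]

# Each position refers to the DataFrame as it looks at the time of that call,
# so index 4 already accounts for the "Tutors" column inserted at index 0
for position, name, values in new_columns:
    df.insert(position, name, values)

print(list(df.columns))
# ['Tutors', 'Courses', 'Fee', 'Duration', 'Percent', 'Discount']
```

Without allow_duplicates=True, insert() raises a ValueError if the column name already exists, which is why this sketch starts from a DataFrame that does not yet contain the new columns.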
Complete Example of Adding Multiple Columns
import pandas as pd
technologies = {
    'Courses': ["Spark","PySpark","Hadoop","Pandas"],
    'Fee': [22000,25000,30000,35000],
    'Duration': ['30days','50days','40days','35days'],
    'Discount': [1000,2000,2500,1500]
}
df = pd.DataFrame(technologies)
print(df)
# Add multiple columns to a DataFrame using df[]
Tutors = ['William', 'Henry', 'Michael', 'John']
Percent = ['5%','2%','4%','3%']
df['Tutors'] = Tutors
df['Percent'] = Percent
print(df)
# Add multiple columns using DataFrame.assign()
df2 = df.assign(Tutors = ['William', 'Henry', 'Michael', 'John'],
                Percent = ['5%','2%','4%','3%'])
print(df2)
# Add multiple columns using DataFrame.insert()
# Start from a fresh DataFrame so the inserted columns are not duplicates
df3 = pd.DataFrame(technologies)
df3.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
df3.insert(4, "Percent", ['5%','2%','4%','3%'])
print(df3)
FAQ on Adding Multiple Columns to a DataFrame
You can add multiple columns to a pandas DataFrame using various methods.
- Direct assignment: Use the df[] notation to assign values or lists to new column names.
- insert() method: Use the insert() method to add columns at specific positions.
- assign() method: The assign() method allows you to add multiple columns at once. It creates a new DataFrame with the added columns.
- Concatenation: Use pd.concat() with axis=1 to concatenate the original DataFrame with a new DataFrame containing the additional columns.
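The concatenation approach is the only one in this list that the article does not demonstrate elsewhere, so here is a minimal sketch of it. The DataFrame named extra is a hypothetical helper holding the new columns; it reuses the index of df so that the rows line up.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# Hypothetical DataFrame holding the new columns, built on the same index as df
extra = pd.DataFrame(
    {"Tutors": ["William", "Henry"], "Percent": ["5%", "2%"]},
    index=df.index,
)

# axis=1 places the DataFrames side by side; rows are matched on the index
df2 = pd.concat([df, extra], axis=1)
print(list(df2.columns))  # ['Courses', 'Fee', 'Tutors', 'Percent']
```

Like assign(), pd.concat() returns a new DataFrame and leaves df unchanged. If the two indexes differ, rows are aligned by label and unmatched positions are filled with NaN.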
You can use the assign() method to add multiple columns to a pandas DataFrame in a single call. The assign() method creates a new DataFrame with the additional columns, leaving the original DataFrame unchanged.
- Direct assignment: Convenient for adding columns sequentially. The columns are appended at the end of the DataFrame.
- insert() method: Useful when you want to specify the position of each new column. It inserts a column at a specific location.
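There are several ways to add columns to a pandas DataFrame. Besides the assign() method, direct assignment with df[], and the insert() method, you can use pd.concat() to combine the original DataFrame with another DataFrame that holds the new columns.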
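To add columns with specific data types to a pandas DataFrame, you can specify the data type while assigning values to the new columns, for example by assigning a pd.Series created with an explicit dtype or by calling astype() on the values.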
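As one way to control the data type of new columns, the sketch below builds each column as a pd.Series with an explicit dtype, or converts it with astype(). The column names Rating and Seats are hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# Direct assignment with an explicit dtype on the Series
df["Discount"] = pd.Series([1000, 2000], index=df.index, dtype="int32")
df["Rating"] = pd.Series([4.5, 4.0], index=df.index, dtype="float32")  # hypothetical column

# assign() with a Series converted via astype()
df = df.assign(Seats=pd.Series([30, 25], index=df.index).astype("int16"))  # hypothetical column

print(df.dtypes)
```

Passing a plain Python list instead would let pandas infer the type, which for integers is typically int64.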
Conclusion
In this article, we explored various methods to add or append multiple columns to an existing pandas DataFrame. We covered the use of df[], DataFrame.assign(), and DataFrame.insert(). Each method provides flexibility depending on whether you need to add columns to the end of the DataFrame or at a specific position.
Happy Learning !!