In pandas, the insert() function inserts a column into a DataFrame at a specified location. You choose the exact position where the new column is placed, which is particularly useful for maintaining a desired column order in your DataFrame.
In this article, I will explain the syntax and parameters of the pandas DataFrame insert() function and show how to use it to insert a column at any position of a DataFrame.
Key Points –
- The insert() function adds a new column to a DataFrame at a specific position.
- You can specify the position (index) where the new column should be inserted, allowing precise control over column order.
- The first argument to insert() is the position, followed by the name of the new column and the values to populate it.
- The allow_duplicates parameter can be set to True to permit duplicate column names in the DataFrame.
- The insert() method modifies the DataFrame in place and returns None, meaning it doesn’t return a new DataFrame.
Quick Examples of DataFrame insert() Function
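Below are some quick examples of the pandas DataFrame insert() function. Each example is meant to be run on its own against the DataFrame df created later in this article.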
# Quick examples of dataframe insert()
# Example 1: Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
# Example 2: Add new column at the specific position
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
# Example 3: Add column to specific position of dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
# Example 4: Insert multiple new columns into the dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)
# Example 5: Insert an empty column into the dataframe
df.insert(0,"Blank_Column", " ")
# Example 6: Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')
# Example 7: Use DataFrame.insert() & pandas.Series() Function
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
Syntax of DataFrame.insert()
Following is the syntax of the DataFrame.insert() function.
# Syntax of DataFrame.insert()
DataFrame.insert(loc, column, value, allow_duplicates=_NoDefault.no_default)
Parameters of insert()
Following are the parameters of the pandas DataFrame insert() function.
- loc – int: The integer position at which to insert the new column. It must satisfy 0 <= loc <= len(columns).
- column – str, number, or hashable object: The label of the inserted column.
- value – Scalar, Series, or array-like: The content of the inserted column.
- allow_duplicates – bool, optional, default lib.no_default: Controls whether a column whose label already exists may be inserted. When left at its default it behaves as False, so inserting an existing label raises a ValueError; set it to True to allow duplicate column names.
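The loc and allow_duplicates rules above can be illustrated with a small sketch. The two-row DataFrame and the variable names duplicate_error and loc_error are made up for this illustration and are not part of the article's dataset.

```python
import pandas as pd

# Small illustrative DataFrame (not the article's dataset).
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Inserting an existing label fails while allow_duplicates is left at its default.
try:
    df.insert(0, "Fee", [1, 2])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)

# loc must lie between 0 and len(df.columns); here only 0, 1, or 2 is valid.
try:
    df.insert(5, "Discount", [1000, 2300])
    loc_error = None
except IndexError as exc:
    loc_error = str(exc)

# With allow_duplicates=True the duplicate label is accepted.
df.insert(0, "Fee", [1, 2], allow_duplicates=True)

print(duplicate_error)
print(loc_error)
print(list(df.columns))
```

Both failed calls leave the DataFrame unchanged, so only the last call adds a column.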
Return value of insert()
insert() returns None. It modifies the existing DataFrame in place instead of returning a new DataFrame with the column added.
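Because insert() works in place, assigning its result back to a variable is a common mistake. A minimal sketch, using a small made-up DataFrame, shows what the call actually returns:

```python
import pandas as pd

# Small illustrative DataFrame (not the article's dataset).
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Capture the return value only to inspect it; normally you would not assign it.
result = df.insert(1, "Duration", ["30days", "40days"])

print(result)
print(list(df.columns))
```

Writing df = df.insert(...) would therefore replace your DataFrame with None.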
First, let’s create a DataFrame with a few rows and columns, execute these examples, and validate the results of the Pandas DataFrame.insert() function.
# Create a DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r2 PySpark 25000 40days 2300
r3 Python 22000 35days 1200
r4 pandas 30000 50days 2000
Use DataFrame.insert() Function
You can use the DataFrame.insert() function to insert a new column at any position in an existing DataFrame. While columns are typically added at the end, this function lets you insert them at the beginning, in the middle, or at any other valid position. For example, the following code adds a Tutors column at the beginning of the DataFrame.
Note that column positions in pandas start from zero, and that insert() updates the existing DataFrame object with the new column. The three variants below are alternatives that produce the same result, so each one starts from a DataFrame that has no Tutors column yet.
# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)
# Add new column at the specific position
# Recreate df first: inserting an existing label raises ValueError
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
print(df)
# Add column to specific position of dataframe
# allow_duplicates=True permits a label that already exists
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)
Yields below output.
# Output:
Tutors Courses Fee Duration Discount
r1 William Spark 20000 30days 1000
r2 Henry PySpark 25000 40days 2300
r3 Michael Python 22000 35days 1200
r4 John pandas 30000 50days 2000
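If you know the neighbouring column by name rather than by number, one option is to look up its position first. The sketch below assumes you want the new column placed directly after Fee; it relies on Index.get_loc(), which returns an integer when the column label is unique.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
    "Duration": ["30days", "40days", "35days", "50days"],
})

# Find the integer position of "Fee" and insert right after it.
position = df.columns.get_loc("Fee") + 1
df.insert(position, "Tutors", ["William", "Henry", "Michael", "John"])

print(list(df.columns))
```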
Insert Multiple New Columns into the DataFrame
The DataFrame.insert() function adds one column per call, so you can insert multiple columns into a pandas DataFrame by calling it once for each column, each at its own position. In the following example, let’s insert two new columns, Tutors and Percent, into the original DataFrame. We’ll insert Tutors at position 0 (beginning of the DataFrame) and Percent at position 5 (end of the DataFrame once Tutors has been added).
# Insert multiple new columns into the dataframe
# Start from the original DataFrame
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], allow_duplicates=True)
print(df)
Yields below output.
# Output:
Tutors Courses Fee Duration Discount Percent
r1 William Spark 20000 30days 1000 5%
r2 Henry PySpark 25000 40days 2300 3%
r3 Michael Python 22000 35days 1200 4%
r4 John pandas 30000 50days 2000 2%
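When there are several columns to add, the repeated calls can be driven from a list. This is only a sketch: the new_columns list and its (position, label, values) layout are a hypothetical convention for this example, not something pandas provides. Entries are listed in ascending position so that each position refers to the DataFrame as it looks after the earlier inserts.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
    "Duration": ["30days", "40days", "35days", "50days"],
    "Discount": [1000, 2300, 1200, 2000],
})

# Hypothetical (position, label, values) entries, in ascending position order.
new_columns = [
    (0, "Tutors", ["William", "Henry", "Michael", "John"]),
    (5, "Percent", ["5%", "3%", "4%", "2%"]),
]

# insert() handles one column per call, so loop over the entries.
for loc, label, values in new_columns:
    df.insert(loc, label, values)

print(list(df.columns))
```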
Insert an Empty Column into the DataFrame
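Use the DataFrame.insert() function to insert an empty column at any position in the pandas DataFrame. The example passes a single-space string as a scalar, which pandas repeats for every row, so the column looks blank when printed. The column is added in place on the existing DataFrame object.
# Insert an empty column into the dataframe
# Start from the original DataFrame
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0,"Blank_Column", " ")
print(df)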
Yields below output.
# Output:
Blank_Column Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r2 PySpark 25000 40days 2300
r3 Python 22000 35days 1200
r4 pandas 30000 50days 2000
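A column filled with " " holds real strings, so pandas does not treat it as missing data. If you want the new column to count as missing, one alternative is to insert NaN instead. The sketch below uses a small made-up DataFrame, and the Space_Column and NaN_Column names are illustrative.

```python
import numpy as np
import pandas as pd

# Small illustrative DataFrame (not the article's dataset).
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# A single-space string looks blank but is not a missing value.
df.insert(0, "Space_Column", " ")

# NaN is recognised as missing by isna(), dropna(), fillna(), and so on.
df.insert(1, "NaN_Column", np.nan)

print(df["Space_Column"].isna().any())
print(df["NaN_Column"].isna().all())
```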
Use DataFrame.insert() & pandas.Series() Function
Similarly, you can pass a pandas Series as the value to insert it as a column at any position of the existing DataFrame. This example inserts a tutors column at the beginning of the DataFrame. Notice that when the value is a Series, pandas aligns it with the DataFrame by index label rather than by position.
# Use DataFrame.insert() & pandas.Series() Function
# Start from the original DataFrame
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)
Yields below output.
# Output:
tutors Courses Fee Duration Discount
r1 William Spark 20000 30days 1000
r2 Henry PySpark 25000 40days 2300
r3 Michael Python 22000 35days 1200
r4 John pandas 30000 50days 2000
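Index alignment is easier to see when the Series is out of order or incomplete. In this sketch (a small made-up DataFrame and Series), the values are matched to rows by label, and the row with no matching label receives NaN.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python", "pandas"]},
    index=["r1", "r2", "r3", "r4"],
)

# The Series is shuffled and has no entry for r3.
tutors = pd.Series(["John", "William", "Henry"], index=["r4", "r1", "r2"])

# Values are placed by index label, not by their order in the Series.
df.insert(0, "tutors", tutors)
print(df)
```

For the same reason, a Series built without an index (labels 0, 1, 2, ...) would produce a column of NaN here, because none of its labels match r1 to r4.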
Complete Example DataFrame insert() Function
# Example DataFrame insert() Function
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Each example below starts from a fresh DataFrame: inserting a label
# that already exists raises ValueError unless allow_duplicates=True
# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)
# Add new column at the specific position
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
print(df)
# Add column to specific position of dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)
# Insert multiple new columns into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], allow_duplicates=True)
print(df)
# Insert an empty column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0,"Blank_Column", " ")
print(df)
# Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')
print(df)
# Use DataFrame.insert() & pandas.Series() Function
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)
Conclusion
In conclusion, the DataFrame.insert() function in pandas provides a convenient way to add new columns to an existing DataFrame at specific positions. It modifies the DataFrame in place and gives you precise control over the position at which the new column is inserted.
Happy Learning !!