The pandas DataFrame.insert() function inserts a column into a DataFrame at a specified position. It modifies the existing DataFrame in place rather than creating a new one. In this article, I will explain the syntax and parameters of insert() and show, with examples, how to insert a column at any position of a DataFrame.
1. Quick Examples of DataFrame insert() Function
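If you are in a hurry, below are some quick examples of the pandas DataFrame insert() function. Each example is meant to run on a fresh DataFrame df; inserting a column label that already exists raises a ValueError unless allow_duplicates=True.
# Below are the quick examples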
# Example 1: Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
# Example 2: Add new column at the specific position
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
# Example 3: Add column to specific position, passing allow_duplicates explicitly
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
# Example 4: Insert new multiple columns into the dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)
# Example 5: Insert an empty column into the dataframe
df.insert(0,"Blank_Column", " ")
# Example 6: Insert a column with a constant value
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')
# Example 7: Use DataFrame.insert() with a pandas.Series
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
2. Syntax of DataFrame.insert()
Following is the syntax of the DataFrame.insert() function.
# Syntax of DataFrame.insert()
DataFrame.insert(loc, column, value, allow_duplicates=_NoDefault.no_default)
2.1 Parameters of insert()
Following are the parameters of the pandas DataFrame insert() function.
loc – int: The integer position at which to insert the new column. It must satisfy 0 <= loc <= len(columns).
column – str, number, or hashable object: The label of the inserted column.
value – Scalar, Series, or array-like: The data to insert into the new column.
allow_duplicates – bool, optional, default lib.no_default: Whether a column with an already existing label may be inserted. It is treated as False by default, in which case insert() raises a ValueError if the label is already present.
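As a rough illustration of how loc and allow_duplicates interact, the sketch below uses a small throwaway DataFrame. The name demo and its column labels are made up for this example and are not part of the article's data.

```python
import pandas as pd

# Hypothetical two-column frame used only for this illustration
demo = pd.DataFrame({"A": [1, 2], "B": [3, 4]})

# loc may be as large as len(columns); that value appends at the end
demo.insert(len(demo.columns), "C", [5, 6])

# Inserting a label that already exists fails by default
try:
    demo.insert(0, "A", [7, 8])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = exc

# With allow_duplicates=True the same call succeeds
demo.insert(0, "A", [7, 8], allow_duplicates=True)

print(duplicate_error)
print(list(demo.columns))  # ['A', 'A', 'B', 'C']
```

Duplicate labels make label-based selection return more than one column, so allow_duplicates=True is usually best reserved for cases where that is intended.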
2.2 Return value of insert()
insert() returns None. It does not create a new DataFrame; the column is added to the existing DataFrame in place.
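A quick way to check what insert() gives back is to capture its result. The sketch below uses a hypothetical scores DataFrame and shows that the call returns None while the original object gains the column.

```python
import pandas as pd

# Hypothetical frame used only for this illustration
scores = pd.DataFrame({"math": [90, 80], "art": [70, 60]})

# insert() mutates `scores` in place, so its result should not be assigned back to the frame
result = scores.insert(1, "music", [85, 75])

print(result)                # None
print(list(scores.columns))  # ['math', 'music', 'art']
```

Writing df = df.insert(...) is therefore a common mistake, because it replaces df with None.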
Now, let's create a pandas DataFrame using data from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.
# Create a DataFrame from a dictionary
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
Yields below output.
# Output:
    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   pandas  30000   50days      2000
3. Use DataFrame.insert() Function
You can use the DataFrame.insert() function to insert a column at any position of an existing DataFrame. Plain assignment, such as df['Tutors'] = values, appends the new column at the end, whereas insert() lets you place it at the beginning, in the middle, or at any other column index. This example adds a Tutors column at the beginning of the DataFrame.
Note that column positions in pandas start from zero. The insert() function updates the existing DataFrame object in place.
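# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)
# Add new column at the specific position (starting again from the original DataFrame)
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors)
print(df)
# Add column to specific position of dataframe, passing allow_duplicates explicitly
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)
Each of the three variants yields the output below.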
# Output:
     Tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
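To make the difference between appending and positional insertion concrete, here is a minimal sketch on a smaller, hypothetical DataFrame named df_demo. It uses columns.get_loc() to look up a position by label, which is one possible way to insert "right after" a named column.

```python
import pandas as pd

# Hypothetical frame used only for this illustration
df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Discount": [1000, 2300],
})

# Plain assignment appends the new column at the end
df_demo["Duration"] = ["30days", "40days"]

# insert() places the column at a chosen position; here, right after "Fee".
# get_loc() returns an integer position when the label is unique.
position = df_demo.columns.get_loc("Fee") + 1
df_demo.insert(position, "Tutors", ["William", "Henry"])

print(list(df_demo.columns))
# ['Courses', 'Fee', 'Tutors', 'Discount', 'Duration']
```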
4. Insert New Multiple Columns into the DataFrame
Each call to DataFrame.insert() adds one column, so to insert multiple columns you call it once per column with the position you want. The example below adds Tutors at the first position (index 0) and then Percent at the sixth position (index 5). Because the first insert shifts the existing columns one place to the right, index 5 is the end of the DataFrame at that point. The fourth positional argument, True, is allow_duplicates.
# Insert new multiple columns into the dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)
print(df)
Yields below output.
# Output:
     Tutors  Courses    Fee Duration  Discount Percent
r1  William    Spark  20000   30days      1000      5%
r2    Henry  PySpark  25000   40days      2300      3%
r3  Michael   Python  22000   35days      1200      4%
r4     John   pandas  30000   50days      2000      2%
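When several columns need to go in at once, the repeated calls can be driven from a small mapping. The sketch below is one possible pattern, not a pandas feature: the new_columns dictionary and the df_demo frame are hypothetical names introduced for illustration.

```python
import pandas as pd

# Hypothetical frame used only for this illustration
df_demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Hypothetical mapping of target position -> (label, values)
new_columns = {
    0: ("Tutors", ["William", "Henry"]),
    3: ("Percent", ["5%", "3%"]),
}

# Insert in ascending position order so that each loc refers to the
# frame as it looks after the earlier inserts
for loc in sorted(new_columns):
    label, values = new_columns[loc]
    df_demo.insert(loc, label, values)

print(list(df_demo.columns))
# ['Tutors', 'Courses', 'Fee', 'Percent']
```

Processing the positions in ascending order matters here: position 3 only becomes valid once the first insert has grown the frame to three columns.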
5. Insert an Empty Column into the DataFrame
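Use the DataFrame.insert() function to insert an empty column at any position of a pandas DataFrame. This adds the column in place on the existing DataFrame object. In the example below, the scalar " " (a single space) is broadcast to every row, so the column looks blank when printed.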
# Insert an empty column into the dataframe
df.insert(0,"Blank_Column", " ")
print(df)
Yields below output.
# Output:
    Blank_Column  Courses    Fee Duration  Discount
r1                  Spark  20000   30days      1000
r2                PySpark  25000   40days      2300
r3                 Python  22000   35days      1200
r4                 pandas  30000   50days      2000
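A column of one-space strings prints as blank, but pandas does not treat it as missing data. If real missing values are wanted, one alternative (not the approach used in the example above) is to insert NaN. The sketch below compares the two on a hypothetical df_demo frame.

```python
import numpy as np
import pandas as pd

# Hypothetical frame used only for this illustration
df_demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# A scalar is broadcast to every row; " " gives a column of one-space strings
df_demo.insert(0, "Blank_Column", " ")

# np.nan gives a column that pandas recognizes as missing data
df_demo.insert(1, "Missing_Column", np.nan)

print(df_demo["Blank_Column"].isna().any())    # False
print(df_demo["Missing_Column"].isna().all())  # True
```

Which one fits better depends on what happens next: isna(), fillna(), and dropna() only react to the NaN version.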
6. Use DataFrame.insert() & pandas.Series
Similarly, you can pass a pandas Series as the value to insert a column at any position of the existing DataFrame. This example inserts a tutors column at the beginning of the DataFrame. Notice that when the value is a Series, pandas aligns it with the DataFrame by index label rather than by position.
# Use DataFrame.insert() with a pandas.Series
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)
Yields below output.
# Output:
     tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
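Index alignment is easiest to see when the Series does not match the DataFrame exactly. In the sketch below (hypothetical names df_demo and tutors), the Series is deliberately out of order and has no entry for r3; pandas matches values by label and leaves NaN where no label is found.

```python
import pandas as pd

# Hypothetical frame used only for this illustration
df_demo = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python"]},
    index=["r1", "r2", "r3"],
)

# The Series is out of order and has no value for "r3"
tutors = pd.Series(["Henry", "William"], index=["r2", "r1"])
df_demo.insert(0, "tutors", tutors)

print(df_demo)
print(df_demo.loc["r1", "tutors"])           # William
print(pd.isna(df_demo.loc["r3", "tutors"]))  # True
```

A plain list of the wrong length would raise an error instead, so converting to a Series with an explicit index is worth considering when the row order of the new data is not guaranteed.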
7. Complete Example of DataFrame insert() Function
# Example DataFrame insert() Function
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)
# Add new column at the specific position (recreate df so "Tutors" does not already exist)
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors)
print(df)
# Add column to specific position of dataframe, passing allow_duplicates explicitly
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)
# Insert new multiple columns into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)
print(df)
# Insert an empty column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0,"Blank_Column", " ")
print(df)
# Insert a column with a constant value
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')
print(df)
# Use DataFrame.insert() with a pandas.Series (aligned on the index labels)
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)
8. Conclusion
In this article, I have explained how to insert a column into an existing pandas DataFrame at any position by using the DataFrame.insert() function, which modifies the DataFrame in place.
Happy Learning !!