Pandas DataFrame insert() Function

In pandas, the insert() function is used to insert a column into a DataFrame at a specified location. This function allows you to specify the exact position where the new column should be placed, which can be particularly useful for maintaining the desired column order in your DataFrame.

Quick Examples of DataFrame insert() Function

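Below are some quick examples of the pandas DataFrame.insert() function. Each snippet is independent and assumes a freshly created four-row DataFrame named df, like the one built later in this article.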


# Quick examples of dataframe insert() 

# Example 1: Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])

# Example 2: Add new column at the specific position
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )

# Example 3: Add a column and allow a duplicate column name
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)

# Example 4: Insert new multiple columns into the dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)

# Example 5: Insert an empty column into the dataframe
df.insert(0,"Blank_Column", " ")

# Example 6: Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')

# Example 7: Use DataFrame.insert() with a pandas.Series
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))

Syntax of DataFrame.insert()

Following is the syntax of the DataFrame.insert() function.


# Syntax of DataFrame.insert()
DataFrame.insert(loc, column, value, allow_duplicates=_NoDefault.no_default)

Parameters of insert()

Following are the parameters of the pandas DataFrame insert() function.

loc – int: The position at which the new column is inserted. It must satisfy 0 <= loc <= len(columns).
column – str, number, or hashable object: The label of the inserted column.
value – Scalar, Series, or array-like: The content of the inserted column. A scalar is repeated for every row, and an array-like must have one entry per row.
allow_duplicates – bool, optional, default lib.no_default: Whether a column label that already exists may be inserted again. By default this is treated as False, and insert() raises a ValueError if the label is already present.
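
The interaction between loc and allow_duplicates can be sketched as follows. This is a minimal illustration, assuming a small two-row DataFrame and made-up values that are not part of the examples later in this article: loc may equal len(df.columns) to append at the end, and reusing an existing label fails unless allow_duplicates=True is passed.

```python
import pandas as pd

# Small illustrative DataFrame (assumed data, not the article's sample)
df = pd.DataFrame({"Courses": ["Spark", "Python"], "Fee": [20000, 22000]})

# loc may be anywhere from 0 to len(df.columns), inclusive
df.insert(len(df.columns), "Percent", ["5%", "3%"])
columns_after_append = list(df.columns)

# Reusing an existing label raises ValueError by default
try:
    df.insert(0, "Fee", [1, 2])
    duplicate_error = None
except ValueError as exc:
    duplicate_error = str(exc)

# With allow_duplicates=True the same call succeeds
df.insert(0, "Fee", [1, 2], allow_duplicates=True)
columns_with_duplicate = list(df.columns)

print(columns_after_append)
print(duplicate_error)
print(columns_with_duplicate)
```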

Return value of insert()

insert() returns None. It modifies the existing DataFrame in place by adding the column at the given position.

First, let’s create a DataFrame with a few rows and columns, execute these examples, and validate the results of the Pandas DataFrame.insert() function.


# Create a pandas DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

Yields below output.


# Output:
    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   pandas  30000   50days      2000

Use DataFrame.insert() Function

You can use the DataFrame.insert() function to insert a new column at any position in an existing DataFrame. While columns are typically added at the end, this function provides the flexibility to insert them at the beginning, in the middle, or at any specified index. For example, the following code adds a Tutors column at the beginning of the DataFrame.

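Note that column positions in pandas start at zero. insert() updates the existing DataFrame object in place, so the three calls below are alternatives, each meant to run on a freshly created DataFrame. Inserting Tutors a second time without allow_duplicates=True raises a ValueError.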


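# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)

# Add new column at the specific position
# (re-create df first; "Tutors" already exists after the call above)
df = pd.DataFrame(technologies, index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0, 'Tutors', tutors)
print(df)

# Add column to specific position of dataframe
# (allow_duplicates=True permits a label that already exists)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)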

Yields below output.


# Output:
     Tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
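
To place a column in the middle by name rather than by a hard-coded number, one option is to look up the position of an existing column with df.columns.get_loc(). The following is a minimal sketch, assuming the same sample data as above and assuming the new column should sit directly after Fee.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
    "Duration": ["30days", "40days", "35days", "50days"],
    "Discount": [1000, 2300, 1200, 2000],
}, index=["r1", "r2", "r3", "r4"])

# Look up the position of "Fee", then insert right after it
position = df.columns.get_loc("Fee") + 1
df.insert(position, "Tutors", ["William", "Henry", "Michael", "John"])

print(df.columns.tolist())
# ['Courses', 'Fee', 'Tutors', 'Duration', 'Discount']
```

This avoids recounting positions by hand whenever the column order changes.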

Insert New Multiple Columns into the DataFrame

DataFrame.insert() adds one column per call, so to insert multiple columns into a pandas DataFrame you call it once for each column, choosing the position each time. In the following example, let’s insert two new columns: Tutors and Percent. We’ll insert Tutors at position 0 (beginning of the DataFrame) and Percent at position 5, which is the end of the DataFrame once Tutors has been added.


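# Insert new multiple columns into the dataframe
# (start again from the original four-column DataFrame)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)
print(df)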

Yields below output.


# Output:
     Tutors  Courses    Fee Duration  Discount Percent
r1  William    Spark  20000   30days      1000      5%
r2    Henry  PySpark  25000   40days      2300      3%
r3  Michael   Python  22000   35days      1200      4%
r4     John   pandas  30000   50days      2000      2%
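
When several columns should land next to each other, the repeated calls can be wrapped in a loop. This is a rough sketch, assuming the same sample data; the new_columns mapping and the start position are hypothetical names chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
    "Duration": ["30days", "40days", "35days", "50days"],
    "Discount": [1000, 2300, 1200, 2000],
}, index=["r1", "r2", "r3", "r4"])

# Hypothetical mapping of new column labels to their values
new_columns = {
    "Tutors": ["William", "Henry", "Michael", "John"],
    "Percent": ["5%", "3%", "4%", "2%"],
}

# Insert the columns side by side, starting at position 1;
# each insert shifts later columns right, so the offset grows by one
start = 1
for offset, (label, values) in enumerate(new_columns.items()):
    df.insert(start + offset, label, values)

print(df.columns.tolist())
# ['Courses', 'Tutors', 'Percent', 'Fee', 'Duration', 'Discount']
```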

Insert an Empty Column into the DataFrame

Use the DataFrame.insert() function to insert an empty column at any position in a pandas DataFrame. Passing a scalar such as a single space " " fills every row with that value. The column is added in place on the existing DataFrame object.


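# Insert an empty column into the dataframe
# (start again from the original four-column DataFrame)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "Blank_Column", " ")
print(df)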

Yields below output.


# Output:
   Blank_Column  Courses    Fee Duration  Discount
r1                 Spark  20000   30days      1000
r2               PySpark  25000   40days      2300
r3                Python  22000   35days      1200
r4                pandas  30000   50days      2000
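
A column filled with " " looks empty when printed, but each cell still holds a one-character string. If the column should count as missing data, np.nan is one alternative. The sketch below uses a small assumed DataFrame, and Missing_Column is a hypothetical label added for comparison.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# A single space is a regular string, so pandas does not treat it as missing
df.insert(0, "Blank_Column", " ")

# np.nan marks every cell of the new column as missing
df.insert(1, "Missing_Column", np.nan)

# Count missing values per column
print(df.isna().sum())
```

Only Missing_Column is reported as missing by isna(), which matters for later calls such as dropna() or fillna().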

Use DataFrame.insert() & pandas.Series

Similarly, you can pass a pandas Series as the value to insert a column at any position of an existing DataFrame. This example inserts a tutors column at the beginning of the DataFrame. Notice that when the value is a Series, pandas aligns it on the index: each value goes to the row with the matching index label, and rows without a match get NaN.


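# Use DataFrame.insert() with a pandas.Series
# (start again from the original four-column DataFrame)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)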

Yields below output.


# Output:
     tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
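
Index alignment is easier to see when the Series labels do not line up exactly with the DataFrame. The following is a small sketch with assumed data: the Series is out of order, has no entry for r4, and carries an extra label r9 that does not exist in the DataFrame.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python", "pandas"]},
    index=["r1", "r2", "r3", "r4"],
)

# Labels are shuffled, "r4" is missing, and "r9" is not in df
tutors = pd.Series(
    ["Henry", "William", "Michael", "Extra"],
    index=["r2", "r1", "r3", "r9"],
)

# Values are matched by index label, not by position
df.insert(0, "tutors", tutors)
print(df)
```

Here r1 receives William and r2 receives Henry despite the shuffled order, r4 is filled with NaN, and the r9 entry is dropped. Passing a plain list instead would assign values by position.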

Complete Example of DataFrame insert() Function


# Example DataFrame insert() Function
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)

# Add new column at the specific position
# (df is re-created before each example so the snippets stay independent)
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
print(df)

# Add column to specific position of dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)

# Insert new multiple columns into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)
print(df)

# Insert an empty column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0,"Blank_Column", " ")
print(df)

# Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')
print(df)

# Use DataFrame.insert() with a pandas.Series
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)

FAQ on Pandas DataFrame insert() Function

What does the insert() function do in Pandas?

The insert() function in Pandas allows you to add a new column to a DataFrame at a specific position. It provides control over where the new column is placed, unlike the typical method of adding columns, which appends them to the end.

How can I insert a new column at the beginning of a DataFrame?

To insert a new column at the beginning of a Pandas DataFrame, you can use the insert() function with loc=0. The loc parameter specifies the position where the new column will be added, and 0 corresponds to the first position.

What happens if allow_duplicates is set to True?

If allow_duplicates=True is set in the insert() function, Pandas allows multiple columns with the same name to exist in the DataFrame. This can be useful in specific cases but may lead to confusion when accessing columns by name.
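
The confusion mentioned above shows up as soon as a duplicated label is selected. This is a minimal sketch with assumed two-row data: selecting a label that appears twice returns a DataFrame rather than a Series.

```python
import pandas as pd

df = pd.DataFrame({"Tutors": ["William", "Henry"], "Fee": [20000, 25000]})

# Add a second column with the same label
df.insert(0, "Tutors", ["Michael", "John"], allow_duplicates=True)

# Selecting the duplicated label returns both columns as a DataFrame
selected = df["Tutors"]
print(type(selected).__name__, selected.shape)
# DataFrame (2, 2)

# A unique label still returns a Series
print(type(df["Fee"]).__name__)
# Series
```

Positional access with df.iloc[:, 0] is one way to pick a single column when labels repeat.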

Is insert() an in-place operation?

The insert() function in Pandas is an in-place operation. It directly modifies the original DataFrame by adding a new column at the specified position and returns None, so its result should not be assigned back to the DataFrame variable.
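
The in-place behavior can be checked directly. The following is a short sketch with assumed data; the names original and modified are illustrative, and copying first is one way to keep the starting DataFrame untouched.

```python
import pandas as pd

original = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Work on a copy so the original DataFrame is left unchanged
modified = original.copy()
result = modified.insert(0, "Tutors", ["William", "Henry"])

print(result)
# None
print(modified.columns.tolist())
# ['Tutors', 'Courses', 'Fee']
print(original.columns.tolist())
# ['Courses', 'Fee']
```

Writing df = df.insert(...) would therefore replace df with None.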

Conclusion

In conclusion, the DataFrame.insert() function in Pandas provides a convenient way to add new columns to an existing DataFrame at specific positions. This method offers flexibility by allowing you to precisely control the index where the new column should be inserted.

Happy Learning !!

