  • Post last modified:November 29, 2024
Pandas DataFrame insert() Function

In pandas, the insert() function is used to insert a column into a DataFrame at a specified location. This function allows you to specify the exact position where the new column should be placed, which can be particularly useful for maintaining the desired column order in your DataFrame.


In this article, I will explain the syntax and parameters of the Pandas DataFrame insert() function and show how to use it to insert a column at any position in a DataFrame.

Key Points –

  • The insert() function is used to add a new column to a DataFrame at a specific position.
  • You can specify the position (index) where the new column should be inserted, allowing for precise control over column order.
  • The first argument in insert() is the position, followed by the name of the new column, and the values to populate it.
  • The allow_duplicates parameter can be set to True to permit duplicate column names in the DataFrame.
  • The insert() method modifies the DataFrame in place and returns None; it doesn’t return a new DataFrame.

Quick Examples of DataFrame insert() Function

Below are some quick examples of Pandas DataFrame insert() function.


# Quick examples of dataframe insert() 

# Example 1: Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])

# Example 2: Add new column at the specific position
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )

# Example 3: Add column to specific position of dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)

# Example 4: Insert new multiple columns into the dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)

# Example 5: Insert an empty column into the dataframe
df.insert(0,"Blank_Column", " ")

# Example 6: Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')

# Example 7: Use DataFrame.insert() & pandas.Series
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))

Syntax of DataFrame.insert()

Following is the syntax of DataFrame.insert() function.


# Syntax of DataFrame.insert()
DataFrame.insert(loc, column, value, allow_duplicates=_NoDefault.no_default)

Parameters of insert()

Following are the parameters of the pandas DataFrame insert() function.

  • loc – int: The zero-based position at which to insert the new column. It must satisfy 0 <= loc <= len(columns).
  • column – str, number, or hashable object: The label of the inserted column.
  • value – Scalar, Series, or array-like: The values for the new column. A scalar is repeated for every row.
  • allow_duplicates – bool, optional, default lib.no_default: Whether to allow a column label that already exists in the DataFrame. When it is not set, it behaves as False, and insert() raises a ValueError if the label is already present.
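
To make the allow_duplicates parameter concrete, here is a minimal sketch. The small two-row DataFrame and the variable names are my own for illustration and are not part of the examples in this article.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Default behaviour: inserting a label that already exists raises ValueError.
try:
    df.insert(0, "Fee", [1, 2])
    error_message = None
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# With allow_duplicates=True the second "Fee" column is accepted.
df.insert(0, "Fee", [1, 2], allow_duplicates=True)
columns_after = list(df.columns)
print(columns_after)
```

Duplicate labels make later selections such as df["Fee"] return more than one column, so it is usually safer to keep the default.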

Return value of insert()

insert() returns None. It adds the column to the existing DataFrame in place rather than returning a new DataFrame.
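
The short sketch below checks the return value of insert(), which is None, and shows one possible way to keep the original DataFrame unchanged by working on a copy. The sample data and the names result, original, and modified are assumptions made only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# insert() mutates df and returns None, so do not assign its result back to df.
result = df.insert(1, "Duration", ["30days", "40days"])
print(result)
print(list(df.columns))

# To leave a DataFrame untouched, insert into an explicit copy instead.
original = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
modified = original.copy()
modified.insert(0, "Tutors", ["William", "Henry"])
print(list(original.columns))
print(list(modified.columns))
```

A common mistake is writing df = df.insert(...), which replaces df with None.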

First, let’s create a DataFrame with a few rows and columns, execute these examples, and validate the results of the Pandas DataFrame.insert() function.


# Create a DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

Yields below output.


# Output:
    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   pandas  30000   50days      2000

Use DataFrame.insert() Function

You can use the DataFrame.insert() function to insert a new column at any position in an existing DataFrame. While columns are typically added at the end, this function provides the flexibility to insert them at the beginning, in the middle, or at any specified index. For example, the following code adds a Tutors column at the beginning of the DataFrame.

Note that column positions in pandas are zero-based, so position 0 is the first column. The insert() function updates the existing DataFrame object with the new column.


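# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)

# Add new column at the specific position
# (rebuild df first; inserting "Tutors" a second time raises ValueError)
df = pd.DataFrame(technologies, index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0, 'Tutors', tutors)
print(df)

# Add column to specific position of dataframe
# (allow_duplicates=True only matters when "Tutors" already exists)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)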

Yields below output.


# Output:
     Tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
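
Instead of hard-coding the position, you can compute it. The sketch below assumes a small DataFrame of my own and uses df.columns.get_loc() to insert right after an existing column, and len(df.columns) to insert at the end.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Discount": [1000, 2300],
})

# Look up the zero-based position of "Fee" and insert right after it.
fee_position = df.columns.get_loc("Fee")
df.insert(fee_position + 1, "Duration", ["30days", "40days"])

# A position equal to the current number of columns appends at the end.
df.insert(len(df.columns), "Percent", ["5%", "3%"])
print(list(df.columns))
```

Note that get_loc() returns a single integer only when the label is unique among the columns.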

Insert New Multiple Columns into the DataFrame

The DataFrame.insert() function adds one column per call, so to insert multiple columns into a Pandas DataFrame you call it once for each column. This allows you to control the exact index where each new column is placed. In the following example, let’s insert two new columns: Tutors and Percent. We’ll insert Tutors at position 0 (beginning of the DataFrame) and Percent at position 5 (end of the DataFrame, once Tutors has been added).


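# Insert new multiple columns into the dataframe
# (start from a fresh DataFrame so that position 5 is the end)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], allow_duplicates=True)
print(df)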

Yields below output.


# Output:
     Tutors  Courses    Fee Duration  Discount Percent
r1  William    Spark  20000   30days      1000      5%
r2    Henry  PySpark  25000   40days      2300      3%
r3  Michael   Python  22000   35days      1200      4%
r4     John   pandas  30000   50days      2000      2%
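
If you need to place several columns next to each other, one option is to loop and increase the position on each call. The insert_columns() helper below is hypothetical (it is not a pandas function), and the sample data is my own; treat it as a sketch rather than the only way to do this.

```python
import pandas as pd

def insert_columns(df, start, new_columns):
    """Hypothetical helper: insert several columns side by side, in place."""
    for offset, (name, values) in enumerate(new_columns.items()):
        # Each call shifts later columns right, so the next one goes at start + offset.
        df.insert(start + offset, name, values)

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
insert_columns(df, 1, {"Tutors": ["William", "Henry"], "Percent": ["5%", "3%"]})
print(list(df.columns))
```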

Insert an Empty Column into the DataFrame

Use the DataFrame.insert() function to insert an empty column at any position in a pandas DataFrame. Passing a scalar such as " " (a single space) fills every row with that value, and the column is added in place to the existing DataFrame object.


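# Insert an empty column into the dataframe
# (start from a fresh DataFrame so the output matches)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "Blank_Column", " ")
print(df)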

Yields below output.


# Output:
   Blank_Column  Courses    Fee Duration  Discount
r1                 Spark  20000   30days      1000
r2               PySpark  25000   40days      2300
r3                Python  22000   35days      1200
r4                pandas  30000   50days      2000
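
A column filled with " " looks blank when printed, but pandas does not treat a space as a missing value. If you want the column to count as missing, one alternative is to insert np.nan instead. The column names and sample data below are assumptions for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# A single-space string prints as blank but is an ordinary value.
df.insert(0, "Blank_Space", " ")

# np.nan is recognised as missing by isna(), dropna(), fillna(), and so on.
df.insert(0, "Blank_NaN", np.nan)

space_is_missing = bool(df["Blank_Space"].isna().any())
nan_is_missing = bool(df["Blank_NaN"].isna().all())
print(space_is_missing, nan_is_missing)
```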

Use DataFrame.insert() & pandas.Series

Similarly, you can pass a pandas Series as the value to insert it as a column at any position of the existing DataFrame. This example inserts a tutors column at the beginning of the DataFrame. Notice that when the value is a Series, pandas aligns it to the DataFrame by index label, which is why the Series below uses the same r1 to r4 index.


# Use DataFrame.insert() & pandas.Series
# (start from a fresh DataFrame so the output matches)
df = pd.DataFrame(technologies, index=index_labels)
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)

Yields below output.


# Output:
     tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
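
To see what index alignment means in practice, the sketch below inserts Series whose index does not fully match the DataFrame. The three-row DataFrame and the column names are my own, chosen only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python"]}, index=["r1", "r2", "r3"])

# Values are matched by index label, not by order; unmatched rows become NaN.
partial = pd.Series(["Michael", "William"], index=["r3", "r1"])
df.insert(0, "Tutors", partial)

# A Series with the default 0..n-1 index shares no labels with df, so every row is NaN.
unlabeled = pd.Series(["William", "Henry", "Michael"])
df.insert(0, "Unaligned", unlabeled)

# Passing the raw values (a list or NumPy array) inserts them by position instead.
df.insert(0, "Positional", unlabeled.to_numpy())
print(df)
```

If a newly inserted column is unexpectedly full of NaN, a mismatched Series index is a likely cause.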

Complete Example of DataFrame insert() Function


# Example DataFrame insert() Function
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)

# Add new column at the specific position
# (rebuild df first; inserting "Tutors" a second time raises ValueError)
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
print(df)

# Add column to specific position of dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)

# Insert new multiple columns into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], allow_duplicates=True)
print(df)

# Insert an empty column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0,"Blank_Column", " ")
print(df)

# Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')
print(df)

# Use DataFrame.insert() & pandas.Series
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)

Conclusion

In conclusion, the DataFrame.insert() function in Pandas provides a convenient way to add new columns to an existing DataFrame at specific positions. This method offers flexibility by allowing you to precisely control the index where the new column should be inserted.

Happy Learning !!
