Pandas DataFrame insert() Function

  • Post last modified:December 24, 2022

The Pandas DataFrame.insert() function inserts a column into a DataFrame at a given position. It updates the existing DataFrame in place rather than creating a new one. In this article, I will explain the syntax and parameters of the pandas DataFrame insert() function and show, with examples, how to use it to insert a column at any position of the DataFrame.

1. Quick Examples of DataFrame insert() Function

If you are in a hurry, below are some quick examples of pandas DataFrame insert() function.


# Below are quick examples

# Example 1: Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])

# Example 2: Add new column at the specific position
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )

# Example 3: Add column to specific position of dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)

# Example 4: Insert new multiple columns into the dataframe
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)

# Example 5: Insert an empty column into the dataframe
df.insert(0,"Blank_Column", " ")

# Example 6: Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')

# Example 7: Use DataFrame.insert() & pandas.Series() Function
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))

2. Syntax of DataFrame.insert()

Following is the syntax of DataFrame.insert() function.


# Syntax of DataFrame.insert()
DataFrame.insert(loc, column, value, allow_duplicates=_NoDefault.no_default)

2.1 Parameters of insert()

Following are the parameters of the pandas DataFrame insert() function.

  • loc – int: The zero-based position at which to insert the new column. It must satisfy 0 <= loc <= len(columns).
  • column – str, number, or hashable object: The label of the inserted column.
  • value – Scalar, Series, or array-like: The data to insert. A scalar is repeated for every row, and a Series is aligned on the DataFrame's index.
  • allow_duplicates – bool, optional, default lib.no_default: Controls whether a column whose label already exists may be inserted. By default duplicates are not allowed, and insert() raises a ValueError if the label is already present. Set it to True to permit a duplicate label.
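
To make the loc and allow_duplicates parameters more concrete, here is a minimal sketch. It uses a small, hypothetical two-column DataFrame named demo (not the one used in the rest of this article), and the variable name duplicate_error is just for illustration.

```python
import pandas as pd

# Small hypothetical DataFrame used only for this illustration
demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# loc equal to the current number of columns appends the column at the end
demo.insert(len(demo.columns), "Discount", [1000, 2300])
print(list(demo.columns))

# With the default allow_duplicates, reusing an existing label raises ValueError
try:
    demo.insert(0, "Fee", [1, 2])
    duplicate_error = None
except ValueError as err:
    duplicate_error = err
print(duplicate_error)

# Passing allow_duplicates=True permits the duplicate label
demo.insert(0, "Fee", [1, 2], allow_duplicates=True)
print(list(demo.columns))
```

Duplicate column labels make later selections such as demo["Fee"] return more than one column, so allow_duplicates=True is usually best kept for cases where that is really intended.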

2.2 Return value of insert()

insert() returns None. It does not create a new DataFrame; the column is added to the existing DataFrame in place.
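
A quick way to check what insert() gives back, and whether it changes the object it is called on, is the sketch below. The DataFrames demo, original, and modified are hypothetical and exist only for this illustration.

```python
import pandas as pd

demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Capture the return value of insert()
result = demo.insert(1, "Duration", ["30days", "40days"])
print(result)              # the method's return value
print(list(demo.columns))  # the object itself now carries the new column

# To leave an existing DataFrame untouched, insert into a copy instead
original = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})
modified = original.copy()
modified.insert(0, "Tutors", ["William", "Henry"])
print(list(original.columns))
print(list(modified.columns))
```

Because of this, writing df = df.insert(...) would replace df with None, so the call is normally written on its own line.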

Now, let’s create a Pandas DataFrame using data from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

Yields below output.


# Output
    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   pandas  30000   50days      2000

3. Use DataFrame.insert() Function

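You can use the DataFrame.insert() function to insert a column at any position of an existing DataFrame. Assigning a new column with df['col'] = values always appends it at the end, whereas insert() lets you place it at the beginning, in the middle, or at any other column position. This example adds a Tutors column at the beginning of the DataFrame.

Note that column positions in pandas are zero-based, so loc=0 is the first column. The insert() function updates the existing DataFrame object in place. The three snippets below are alternatives that produce the same result, so each one starts from a fresh DataFrame; running them one after another on the same DataFrame would raise a ValueError because the Tutors column would already exist.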


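# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)

# Add new column at the specific position
# (recreate the DataFrame first, because "Tutors" already exists in df)
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
print(df)

# Add column to specific position of dataframe
# (allow_duplicates=True would also permit a second "Tutors" column)
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)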

Yields below output.


# Output
     Tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
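
The examples above insert at position 0. To place a column in the middle without counting positions by hand, one option is to look up the position of an existing column with columns.get_loc() and insert right after it. This is a minimal sketch under that assumption; the name df_mid and the choice of Fee as the reference column are only for illustration.

```python
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000]
}
df_mid = pd.DataFrame(technologies, index=['r1', 'r2', 'r3', 'r4'])

# Position of the existing 'Fee' column (an int when the label is unique)
fee_pos = df_mid.columns.get_loc("Fee")

# Insert 'Tutors' immediately after 'Fee'
df_mid.insert(fee_pos + 1, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df_mid)
```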

4. Insert New Multiple Columns into the DataFrame

Each call to DataFrame.insert() adds exactly one column, so to insert multiple columns into a Pandas DataFrame you call it once per column, giving the position for each. The example below adds one column at the first position (index 0) and another at the sixth position (index 5), which is the end of the DataFrame once the first column has been inserted.


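# Insert multiple new columns into the dataframe (starting from a fresh DataFrame)
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], allow_duplicates=True)
print(df)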

Yields below output.


# Output
     Tutors  Courses    Fee Duration  Discount Percent
r1  William    Spark  20000   30days      1000      5%
r2    Henry  PySpark  25000   40days      2300      3%
r3  Michael   Python  22000   35days      1200      4%
r4     John   pandas  30000   50days      2000      2%
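
When there are several columns to add, the repeated calls can be driven from a list. The following is a minimal sketch, not a built-in pandas feature: the new_columns list of (position, label, values) tuples and the name df_multi are assumptions made for illustration.

```python
import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Python", "pandas"],
    'Fee': [20000, 25000, 22000, 30000],
    'Duration': ['30days', '40days', '35days', '50days'],
    'Discount': [1000, 2300, 1200, 2000]
}
df_multi = pd.DataFrame(technologies, index=['r1', 'r2', 'r3', 'r4'])

# (position, label, values) for each new column; each position refers to
# the DataFrame as it looks at the moment of that particular insert
new_columns = [
    (0, "Tutors", ['William', 'Henry', 'Michael', 'John']),
    (5, "Percent", ['5%', '3%', '4%', '2%']),
]

for loc, label, values in new_columns:
    df_multi.insert(loc, label, values)

print(df_multi)
```

Since every insert shifts the columns to its right, the positions in the list have to account for the columns added by earlier entries.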

5. Insert an Empty Column into the DataFrame

Use the DataFrame.insert() function to insert an empty column at any position of the pandas DataFrame. Passing a scalar such as " " fills every row with that value. This adds the column in place on the existing DataFrame object.


# Insert an empty column into the dataframe (starting from a fresh DataFrame)
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0,"Blank_Column", " ")
print(df)

Yields below output.


# Output
   Blank_Column  Courses    Fee Duration  Discount
r1                 Spark  20000   30days      1000
r2               PySpark  25000   40days      2300
r3                Python  22000   35days      1200
r4                pandas  30000   50days      2000
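
Note that a column filled with " " holds one-space strings, which pandas does not treat as missing values. If the goal is a column that isna() recognizes as empty, one alternative is to insert NaN instead. This is a minimal sketch of that variation; the names df_blank and df_space are hypothetical.

```python
import numpy as np
import pandas as pd

courses = {'Courses': ["Spark", "PySpark", "Python", "pandas"]}

# Column of NaN values: recognized as missing by pandas
df_blank = pd.DataFrame(courses, index=['r1', 'r2', 'r3', 'r4'])
df_blank.insert(0, "Blank_Column", np.nan)
all_missing = bool(df_blank["Blank_Column"].isna().all())
print(all_missing)

# Column of one-space strings: looks blank when printed, but is not missing
df_space = pd.DataFrame(courses, index=['r1', 'r2', 'r3', 'r4'])
df_space.insert(0, "Blank_Column", " ")
any_missing = bool(df_space["Blank_Column"].isna().any())
print(any_missing)
```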

6. Use DataFrame.insert() & pandas.Series() Function

Similarly, you can pass a pandas Series as the value to insert it as a column at any position of the existing DataFrame. This example inserts a tutors column at the beginning of the DataFrame. Notice that when the value is a Series, pandas aligns it with the DataFrame by index label rather than by position.


# Use DataFrame.insert() & pandas.Series() Function (starting from a fresh DataFrame)
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)

Yields below output.


# Output
     tutors  Courses    Fee Duration  Discount
r1  William    Spark  20000   30days      1000
r2    Henry  PySpark  25000   40days      2300
r3  Michael   Python  22000   35days      1200
r4     John   pandas  30000   50days      2000
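
Index alignment is easier to see when the Series index does not match the DataFrame index exactly. In the minimal sketch below, the Series (named partial here purely for illustration) lists its labels in a different order and omits r4, so the values are matched by label and the missing row becomes NaN.

```python
import pandas as pd

df_align = pd.DataFrame(
    {'Courses': ["Spark", "PySpark", "Python", "pandas"]},
    index=['r1', 'r2', 'r3', 'r4']
)

# Labels are out of order and 'r4' is absent
partial = pd.Series(['William', 'Henry', 'Michael'], index=['r3', 'r1', 'r2'])

# Values are placed by index label, not by position
df_align.insert(0, "tutors", partial)
print(df_align)
```

A plain list, by contrast, is assigned by position and must have the same length as the DataFrame.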

7. Complete Example of DataFrame insert() Function


import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Use DataFrame.insert() function
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'])
print(df)

# Add new column at the specific position
# (each example below starts from a fresh DataFrame so that the
# "Tutors" label does not already exist)
df = pd.DataFrame(technologies,index=index_labels)
tutors = ['William', 'Henry', 'Michael', 'John']
df.insert(0,'Tutors', tutors )
print(df)

# Add column to specific position of dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], allow_duplicates=True)
print(df)

# Insert multiple new columns into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0, "Tutors", ['William', 'Henry', 'Michael', 'John'], True)
df.insert(5, "Percent", ['5%','3%','4%','2%'], True)
print(df)

# Insert an empty column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(0,"Blank_Column", " ")
print(df)

# Insert a new column into the dataframe
df = pd.DataFrame(technologies,index=index_labels)
df.insert(4, 'Percent', '5%')
print(df)

# Use DataFrame.insert() & pandas.Series() Function
df.insert(0, "tutors", pd.Series(['William', 'Henry', 'Michael', 'John'], index=['r1','r2','r3','r4']))
print(df)

8. Conclusion

In this article, I have explained how to insert a column into an existing Pandas DataFrame by using the DataFrame.insert() function. insert() places the column at any position you specify and modifies the DataFrame in place.

Happy Learning !!
