  • Post last modified:July 31, 2024
Pandas DataFrame pop() Method

In pandas, the pop() method is used to remove a specified column from a DataFrame and return it as a Series. This can be useful if you need to work with a specific column separately from the DataFrame or if you want to modify the DataFrame by removing certain columns.

In this article, I will explain the pandas DataFrame pop() method, covering its syntax, parameters, and usage, and show how it returns the specified column while removing it from the DataFrame. The method raises a KeyError if the column is not found.

Key Points –

  • The pop() method is used to remove a specified column from a DataFrame and return it as a Series.
  • This method directly modifies the original DataFrame by removing the specified column.
  • The method takes a single parameter, the label of the column to remove (usually a string column name).
  • If the specified column does not exist in the DataFrame, pop() raises a KeyError.
  • The pop() method only removes one column at a time; it does not support popping multiple columns simultaneously.

Pandas DataFrame pop() Introduction

Here is the syntax of the pop() method.


# Syntax of Pandas DataFrame pop()
DataFrame.pop(item)

Parameters of the DataFrame pop()

The method accepts exactly one parameter.

  • item – The label of the column to be removed. This is a single column label, most often a string column name, though any label that appears in the DataFrame's columns (for example, an integer) works.
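
As a small illustration of the item parameter, the sketch below pops a column whose label is an integer rather than a string. The DataFrame df_int and the variable middle are made up for this example and do not appear elsewhere in the article.

```python
import pandas as pd

# A DataFrame built without column names gets the integer labels 0, 1, 2.
df_int = pd.DataFrame([[1, 2, 3], [4, 5, 6]])

# The label passed to pop() is the integer 1, not the string "1".
middle = df_int.pop(1)

print(middle.tolist())          # [2, 5]
print(df_int.columns.tolist())  # [0, 2]
```

Passing the string "1" here would raise a KeyError, because the label must match the column label exactly.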

Return Value

The column that is removed from the DataFrame is returned as a Series.
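
A quick way to confirm the return value is to inspect the object that pop() hands back. This is a minimal sketch; df_demo and fee are illustrative names, and the two-row DataFrame is a shortened version of the data used later in this article.

```python
import pandas as pd

df_demo = pd.DataFrame({"Courses": ["Spark", "Pandas"], "Fee": [22000, 35000]})

# pop() returns the removed column and drops it from df_demo.
fee = df_demo.pop("Fee")

print(type(fee).__name__)                # Series
print(fee.name)                          # Fee
print(fee.index.equals(df_demo.index))   # True
print(list(df_demo.columns))             # ['Courses']
```

The returned Series keeps the column label as its name and shares the DataFrame's index, so it stays aligned with the remaining rows.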

Usage of Pandas DataFrame pop() Method

The pop() method in pandas removes the specified column from the DataFrame and returns it as a Series. If the column does not exist in the DataFrame, a KeyError is raised.

To run some examples of the pandas DataFrame pop() method, let’s create a pandas DataFrame using data from a dictionary.


import pandas as pd

technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
    'Fee': [22000, 25000, 30000, 35000],
    'Discount': [1000, 2300, 2500, 2200],
    'Duration': ['35days', '40days', '50days', '45days'],
}

df = pd.DataFrame(technologies)
print("Original DataFrame:\n", df)

Running this code prints the original DataFrame, which has four rows and the columns Courses, Fee, Discount, and Duration.

The following example pops the Fee column from the DataFrame and prints both the returned Series and the remaining DataFrame.


# Pop column 'Fee'
popped_column = df.pop('Fee')
print("Popped Column:\n", popped_column)
print("DataFrame after pop:\n", df)

Here,

  • The initial DataFrame contains four columns: Courses, Fee, Discount, and Duration.
  • The pop() method is used to remove the Fee column. This column is returned as a Series and stored in the variable popped_column.
  • The removed Fee column is displayed as a Series with its index and values.
  • The DataFrame after applying pop() no longer contains the Fee column, reflecting the change made by the method.
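
Because the popped column is an ordinary Series, you can keep working with it separately from the DataFrame. The sketch below is one possible use, assuming a three-column version of the data above; the names df_demo, average_fee, and expensive are chosen for illustration only.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Pandas"],
    "Fee": [22000, 25000, 30000, 35000],
    "Discount": [1000, 2300, 2500, 2200],
})

# Separate the Fee column from the rest of the data.
fee = df_demo.pop("Fee")

# Work with the popped Series on its own.
average_fee = fee.mean()
print(average_fee)  # 28000.0

# The Series shares the DataFrame's index, so it can still filter its rows.
expensive = df_demo[fee > 25000]
print(expensive["Courses"].tolist())  # ['Hadoop', 'Pandas']
```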

KeyError in DataFrame while using pop() Method

When you call pop() on a pandas DataFrame with a column label that does not exist, a KeyError is raised. The error message contains the label that was not found.


# Attempt to pop a non-existent column 'Price'
try:
    popped_column = df.pop('Price')
    print("Popped Column:\n", popped_column)
except KeyError as e:
    print(f"KeyError: {e}")
    
# Output:
# KeyError: 'Price'

In the above example, when attempting to pop the Price column, which does not exist in the DataFrame, a KeyError is raised. The error is caught in the except block, and an appropriate error message is printed.
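
If you would rather avoid the exception altogether, one option is to check for the column before popping it. The helper pop_or_default below is hypothetical (it is not part of pandas) and is only a sketch of that idea.

```python
import pandas as pd

def pop_or_default(frame, column, default=None):
    """Hypothetical helper: pop `column` if it exists, otherwise return `default`."""
    if column in frame.columns:
        return frame.pop(column)
    return default

df_demo = pd.DataFrame({"Courses": ["Spark", "Pandas"], "Fee": [22000, 35000]})

price = pop_or_default(df_demo, "Price")  # column is missing, so nothing is removed
fee = pop_or_default(df_demo, "Fee")      # column exists, so it is popped

print(price)                  # None
print(fee.tolist())           # [22000, 35000]
print(list(df_demo.columns))  # ['Courses']
```

Whether to check first or catch the KeyError is largely a matter of taste; the try-except version above is the more common Python idiom when a missing column is unexpected.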

Pop and Assign to a New Column

You can use the pop() method to remove a column from a DataFrame and then assign the returned Series to a new column in the same or another DataFrame. The example below starts again from the original DataFrame, so the Fee column is still present.


# Pop the 'Discount' column and assign it
# to a new column 'DiscountedFee'
df['DiscountedFee'] = df.pop('Discount')
print("DataFrame after pop and assigning to a new column:\n", df)

Here,

  • The pop() method is used to remove the Discount column. The returned Series (which is the Discount column) is then assigned to a new column named DiscountedFee.
  • The DataFrame now has a new column DiscountedFee, added as the last column, and the Discount column has been removed.

# Output:
# DataFrame after pop and assigning to a new column:
    Courses    Fee Duration  DiscountedFee
0    Spark  22000   35days           1000
1  PySpark  25000   40days           2300
2   Hadoop  30000   50days           2500
3   Pandas  35000   45days           2200
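
Pop-and-assign always places the column at the end. If you want the popped column at a specific position instead, one approach is to combine pop() with DataFrame.insert(). The sketch below uses a shortened two-row version of the data; df_demo is an illustrative name.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [22000, 25000],
    "Discount": [1000, 2300],
    "Duration": ["35days", "40days"],
})

# Pop-and-assign appends the column at the end under the new name.
df_demo["DiscountedFee"] = df_demo.pop("Discount")
print(list(df_demo.columns))  # ['Courses', 'Fee', 'Duration', 'DiscountedFee']

# pop() runs first (it is an argument), then insert() places the
# returned Series at position 0.
df_demo.insert(0, "Duration", df_demo.pop("Duration"))
print(list(df_demo.columns))  # ['Duration', 'Courses', 'Fee', 'DiscountedFee']
```

If you only need a new name and want the column to stay where it is, df.rename(columns={...}) is the more direct tool, since it does not change the column order.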

Pop Multiple Columns Sequentially

To pop multiple columns from a pandas DataFrame, call the pop() method once for each column you want to remove and handle each returned Series as needed. This example starts from the original DataFrame, with all four columns present.


# Pop the 'Fee' column
fee_column = df.pop('Fee')
print("Popped 'Fee' Column:\n", fee_column)

# Pop the 'Discount' column
discount_column = df.pop('Discount')
print("Popped 'Discount' Column:\n", discount_column)
print("\nDataFrame after popping 'Fee' and 'Discount' columns:\n", df)

Here,

  • The pop() method removes the Fee column, and it is returned as a Series, stored in the variable fee_column.
  • Similarly, the pop() method removes the Discount column, and it is returned as a Series, stored in the variable discount_column.
  • The DataFrame after popping the Fee and Discount columns now only contains the Courses and Duration columns.

# Output:
# Popped 'Fee' Column:
 0    22000
1    25000
2    30000
3    35000
Name: Fee, dtype: int64

# Popped 'Discount' Column:
 0    1000
1    2300
2    2500
3    2200
Name: Discount, dtype: int64

# DataFrame after popping 'Fee' and 'Discount' columns:
    Courses Duration
0    Spark   35days
1  PySpark   40days
2   Hadoop   50days
3   Pandas   45days
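
When the list of columns is longer, repeating pop() by hand gets tedious. A loop or dictionary comprehension does the same thing; the sketch below is one way to write it, with to_pop, popped, and popped_df as illustrative names.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Pandas"],
    "Fee": [22000, 25000, 30000, 35000],
    "Discount": [1000, 2300, 2500, 2200],
    "Duration": ["35days", "40days", "50days", "45days"],
})

to_pop = ["Fee", "Discount"]

# Call pop() once per column and keep each returned Series by name.
popped = {name: df_demo.pop(name) for name in to_pop}

print(list(popped))           # ['Fee', 'Discount']
print(list(df_demo.columns))  # ['Courses', 'Duration']

# Optionally combine the popped Series into their own DataFrame.
popped_df = pd.concat(popped, axis=1)
print(list(popped_df.columns))  # ['Fee', 'Discount']
```

If you do not need the removed data afterwards, df.drop(columns=[...]) removes several columns in one call; note that drop() returns a new DataFrame by default instead of modifying the original.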

Frequently Asked Questions on Pandas DataFrame pop() Method

What does the pop() method do in a pandas DataFrame?

The pop() method in a pandas DataFrame removes the specified column from the DataFrame and returns it as a Series.

How do I use the pop() method to remove a column?

Pass the label of the column you want to remove as the argument, for example df.pop('Fee'). The method returns that column as a Series.

What happens if I try to pop() a column that does not exist in the DataFrame?

If you try to pop() a column that does not exist in the DataFrame, a KeyError will be raised. You can handle this error using a try-except block.

Can I assign the popped column to a new column in the same DataFrame?

Yes, you can assign the popped column to a new column in the same DataFrame. This can be useful if you want to rename a column or move it to the end of the DataFrame.

Does the pop() method modify the original DataFrame?

Yes. The pop() method modifies the original DataFrame in place: the specified column is removed from it and returned as a Series, so the column is no longer part of the DataFrame after the call.
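
If you need to keep the original DataFrame intact, a simple approach is to pop from a copy. This is a minimal sketch; the names original and working are illustrative.

```python
import pandas as pd

original = pd.DataFrame({"Courses": ["Spark", "Pandas"], "Fee": [22000, 35000]})

# Work on a copy so the original keeps all of its columns.
working = original.copy()
fee = working.pop("Fee")

print(list(original.columns))  # ['Courses', 'Fee']
print(list(working.columns))   # ['Courses']
```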

Conclusion

In conclusion, the pop() method in pandas removes a single column from a DataFrame in place and returns it as a Series. This makes it handy when you need to separate a column from the rest of the data, rename it, or move it within the DataFrame.

Happy Learning!!