In pandas, the pop() method removes a specified column from a DataFrame and returns it as a Series. This is useful when you need to work with a column separately from the DataFrame, or when you want to drop it from the DataFrame while keeping its values.
In this article, I will explain the pandas DataFrame pop() method: its syntax, its parameter, and how it returns the specified item and removes it from the DataFrame. It raises a KeyError if the item is not found.
Key Points –
- The pop() method removes a specified column from a DataFrame and returns it as a Series.
- This method directly modifies the original DataFrame by removing the specified column.
- The method takes a single parameter, the column label (usually a string), that specifies which column to remove.
- If the specified column does not exist in the DataFrame, pop() raises a KeyError.
- The pop() method removes only one column per call; it does not support popping multiple columns simultaneously.
Pandas DataFrame pop() Introduction
Let’s look at the syntax of the pop() method.
# Syntax of Pandas DataFrame pop()
DataFrame.pop(item)
Parameters of the DataFrame pop()
It accepts only one parameter.
item – The label (i.e., column name) of the column to be removed. This is a single column label, most often a string.
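Column labels are often strings, but they do not have to be. As a minimal sketch, assuming a DataFrame built without explicit column names (so pandas labels the columns 0 and 1), pop() can be called with an integer label. The variable names here are illustrative only.

```python
import pandas as pd

# No column names given, so pandas labels the columns 0 and 1
df_int = pd.DataFrame([[1, 2], [3, 4]])

# Pop the column whose label is the integer 1
second = df_int.pop(1)

print(second.tolist())        # [2, 4]
print(list(df_int.columns))   # [0]
```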
Return Value
The column that is removed from the DataFrame is returned as a Series.
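A quick way to confirm the return value is to inspect what pop() hands back. The sketch below uses a small made-up DataFrame (not the one used in the rest of this article) and checks the type, the name of the returned Series, and the columns that remain.

```python
import pandas as pd

# Small illustrative DataFrame with made-up values
df_small = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

popped = df_small.pop("a")

print(type(popped).__name__)    # Series
print(popped.name)              # a
print(list(df_small.columns))   # ['b']
```

The returned Series keeps the column label as its name, which is handy if you later want to put it back into a DataFrame.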
Usage of Pandas DataFrame pop() Method
The pop() method in pandas removes the specified column from the DataFrame and returns it as a Series. If the column does not exist in the DataFrame, a KeyError is raised.
To run some examples of the pandas DataFrame pop() method, let’s create a pandas DataFrame using data from a dictionary.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Pandas"],
    'Fee': [22000, 25000, 30000, 35000],
    'Discount': [1000, 2300, 2500, 2200],
    'Duration': ['35days', '40days', '50days', '45days']
}
df = pd.DataFrame(technologies)
print("Original DataFrame:\n", df)
Running this prints the original DataFrame with its four columns.
The pop() method is a convenient way to remove a column from the DataFrame and get it back as a Series. The example below pops the Fee column.
# Pop column 'Fee'
popped_column = df.pop('Fee')
print("Popped Column:\n", popped_column)
print("DataFrame after pop:\n", df)
Here,
- The initial DataFrame contains four columns: Courses, Fee, Discount, and Duration.
- The pop() method is used to remove the Fee column. This column is returned as a Series and stored in the variable popped_column.
- The removed Fee column is displayed as a Series with its index and values.
- The DataFrame after applying pop() no longer contains the Fee column, reflecting the change made by the method.
KeyError in DataFrame while using pop() Method
When you call the pop() method on a pandas DataFrame with a column that does not exist in the DataFrame, a KeyError is raised. This error indicates that the specified column name was not found.
# Attempt to pop a non-existent column 'Price'
try:
    popped_column = df.pop('Price')
    print("Popped Column:\n", popped_column)
except KeyError as e:
    print(f"KeyError: {e}")
# Output:
# KeyError: 'Price'
In the above example, when attempting to pop the Price column, which does not exist in the DataFrame, a KeyError is raised. The error is caught in the except block, and an appropriate error message is printed.
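If a missing column is an expected situation rather than an error, you can check for the label before popping instead of catching the exception. The sketch below defines pop_if_present, a hypothetical helper written for this article (it is not part of pandas), which returns a default value when the column is absent.

```python
import pandas as pd

def pop_if_present(frame, label, default=None):
    """Hypothetical helper (not a pandas API): pop `label` if it is a
    column of `frame`, otherwise return `default` and leave `frame` as is."""
    if label in frame.columns:
        return frame.pop(label)
    return default

# Small illustrative DataFrame
courses = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

missing = pop_if_present(courses, "Price")   # column absent, nothing removed
fee = pop_if_present(courses, "Fee")         # column present, removed and returned

print(missing)                  # None
print(fee.tolist())             # [22000, 25000]
print(list(courses.columns))    # ['Courses']
```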
Pop and Assign to a New Column
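You can use the pop() method to remove a column from a DataFrame and then assign the returned Series to a new column in the same or another DataFrame. The example below starts again from the original four-column DataFrame, before the Fee column was popped.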
# Pop the 'Discount' column and assign it
# To a new column 'DiscountedFee'
df['DiscountedFee'] = df.pop('Discount')
print("DataFrame after pop and assigning to a new column:\n", df)
Here,
- The pop() method is used to remove the Discount column. The returned Series (which is the Discount column) is then assigned to a new column named DiscountedFee.
- The DataFrame now has a new column DiscountedFee, and the Discount column has been removed.
# Output:
# DataFrame after pop and assigning to a new column:
#    Courses    Fee  Duration  DiscountedFee
# 0    Spark  22000    35days           1000
# 1  PySpark  25000    40days           2300
# 2   Hadoop  30000    50days           2500
# 3   Pandas  35000    45days           2200
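Note that popping and reassigning places the column at the end of the DataFrame, as the output above shows. If you only want a new name and need the column to stay where it was, DataFrame.rename() is an alternative. The sketch below compares the two on a shortened version of the example data; the variable names are illustrative.

```python
import pandas as pd

data = {
    "Courses": ["Spark", "PySpark"],
    "Discount": [1000, 2300],
    "Duration": ["35days", "40days"],
}

# pop + assign: the column is removed, then appended as the last column
moved = pd.DataFrame(data)
moved["DiscountedFee"] = moved.pop("Discount")
print(list(moved.columns))     # ['Courses', 'Duration', 'DiscountedFee']

# rename: the column keeps its original position
renamed = pd.DataFrame(data).rename(columns={"Discount": "DiscountedFee"})
print(list(renamed.columns))   # ['Courses', 'DiscountedFee', 'Duration']
```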
Pop Multiple Columns Sequentially
Similarly, to pop multiple columns from a pandas DataFrame, call the pop() method once for each column you want to remove and handle each returned Series as needed. This example also starts from the original four-column DataFrame.
# Pop the 'Fee' column
fee_column = df.pop('Fee')
print("Popped 'Fee' Column:\n", fee_column)
# Pop the 'Discount' column
discount_column = df.pop('Discount')
print("Popped 'Discount' Column:\n", discount_column)
print("\nDataFrame after popping 'Fee' and 'Discount' columns:\n", df)
Here,
- The pop() method removes the Fee column, and it is returned as a Series, stored in the variable fee_column.
- Similarly, the pop() method removes the Discount column, and it is returned as a Series, stored in the variable discount_column.
- The DataFrame after popping the Fee and Discount columns now only contains the Courses and Duration columns.
# Output:
# Popped 'Fee' Column:
# 0    22000
# 1    25000
# 2    30000
# 3    35000
# Name: Fee, dtype: int64
# Popped 'Discount' Column:
# 0    1000
# 1    2300
# 2    2500
# 3    2200
# Name: Discount, dtype: int64
# DataFrame after popping 'Fee' and 'Discount' columns:
#    Courses  Duration
# 0    Spark    35days
# 1  PySpark    40days
# 2   Hadoop    50days
# 3   Pandas    45days
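When there are several columns to remove, the repeated calls can be written as a loop. The sketch below is one possible approach, not a built-in pandas feature: it pops each label in a list and uses pd.concat() to gather the returned Series into a new DataFrame. The names to_pop and popped are illustrative.

```python
import pandas as pd

courses = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Pandas"],
    "Fee": [22000, 25000, 30000, 35000],
    "Discount": [1000, 2300, 2500, 2200],
    "Duration": ["35days", "40days", "50days", "45days"],
})

to_pop = ["Fee", "Discount"]

# pop() is still called once per column; concat lines the Series up side by side
popped = pd.concat([courses.pop(label) for label in to_pop], axis=1)

print(list(popped.columns))    # ['Fee', 'Discount']
print(list(courses.columns))   # ['Courses', 'Duration']
```

If any label in the list is missing, the corresponding pop() call raises a KeyError, and columns popped before it have already been removed from the DataFrame.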
Frequently Asked Questions on Pandas DataFrame pop() Method
The pop() method in a pandas DataFrame removes the specified column from the DataFrame and returns it as a Series.
To use the pop() method, specify the name of the column you want to remove as an argument.
If you try to pop() a column that does not exist in the DataFrame, a KeyError will be raised. You can handle this error using a try-except block.
You can assign the popped column to a new column in the same DataFrame. This can be useful if you want to rename a column or move it to the end of the DataFrame.
The pop() method does modify the original DataFrame. It removes the specified column in place and returns it as a Series, so the column is no longer part of the DataFrame after the pop() operation.
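Because the change is made in place, one way to keep the original DataFrame intact is to pop from a copy. The following is a minimal sketch using DataFrame.copy(); the variable names original and working are illustrative.

```python
import pandas as pd

original = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# Pop from a copy so that `original` keeps all of its columns
working = original.copy()
fee = working.pop("Fee")

print(list(original.columns))   # ['Courses', 'Fee']
print(list(working.columns))    # ['Courses']
print(fee.tolist())             # [22000, 25000]
```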
Conclusion
In conclusion, the pop() method in pandas is a convenient tool for removing a column from a DataFrame and retrieving it as a Series. It modifies the DataFrame in place while returning the specified column, which makes it useful for tasks such as separating a column for further processing, renaming it, or moving it to the end of the DataFrame.
Happy Learning!!