Pandas – Drop List of Rows From DataFrame

Post category: Pandas
Post last modified: March 27, 2024

The pandas.DataFrame.drop() method removes a list of rows from a DataFrame; all you need to pass is a list of row index labels. By default, drop() returns a copy of the DataFrame with those rows removed and leaves the original DataFrame unchanged. To remove the rows from the existing DataFrame instead, pass inplace=True. The default value of inplace is False, which means the existing DataFrame is not updated.

By using the same drop() method, you can also remove columns in pandas DataFrame by using axis=1.
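As a minimal sketch of the column case, the snippet below drops one column from a small DataFrame. The three-row frame is a cut-down, illustrative version of the sample data used later in this article.

```python
import pandas as pd

# Small illustrative frame (not the full sample data)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Discount": [1000, 2300, 1500],
})

# axis=1 tells drop() that the labels are column names
without_fee = df.drop(["Fee"], axis=1)

# The columns= keyword is an equivalent spelling
without_fee_kw = df.drop(columns=["Fee"])

print(without_fee.columns.tolist())  # ['Courses', 'Discount']
print(df.columns.tolist())           # original still has 'Fee'
```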

Key Points –

  • Use the drop() method of a pandas DataFrame to remove specific rows by index label.
  • drop() matches index labels, not positions. With the default integer index, labels and positions happen to coincide.
  • To drop rows by label, pass the labels as a list to the labels parameter with axis=0 (the default), or pass them to the index parameter.
  • To drop rows by position, first convert the positions to labels, for example with df.index[[2, 4]], and pass the result to drop().
  • Set the inplace parameter to True if you want the change applied to the original DataFrame; drop() then returns None.
  • Be clear about whether a list holds labels or positions, because the two differ once the index is not the default range.
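The points above can be sketched as follows. The four-row frame and the labels r1 to r4 are illustrative choices, picked so that labels and positions are clearly different things.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"],
     "Fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# Drop by label: labels= with axis=0 (the default axis)
by_labels = df.drop(labels=["r2", "r4"], axis=0)

# Drop by label: index= is an equivalent spelling
by_index_kw = df.drop(index=["r2", "r4"])

# Drop by position: df.index[[1, 3]] looks up the labels at positions 1 and 3
by_position = df.drop(df.index[[1, 3]])

print(by_labels.index.tolist())  # ['r1', 'r3']
```

All three calls return the same result and leave df untouched.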

1. Create a Sample DataFrame

Let’s create a pandas DataFrame to use in the examples that follow. The DataFrame contains the columns Courses, Fee, Duration, and Discount.


# Create a Sample DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day','40days','35days', '40days','60days','50days','55days'],
    'Discount':[1000,2300,1500,1200,2500,2100,2000]
               }
df = pd.DataFrame(technologies)
print("DataFrame:\n", df)

This prints a DataFrame with seven rows, labeled 0 through 6 by the default index, and the four columns Courses, Fee, Duration, and Discount.

2. pandas Drop List of Rows by Index

The pandas.DataFrame.drop() function takes a list of index labels that you want to delete. It returns a copy of the DataFrame without modifying the original DataFrame.

Related: In Pandas, you can drop rows from DataFrame by index position.


# Drop by list of index labels (same as positions with the default index)
df1 = df.drop([2,4])
print("After dropping list of rows:\n", df1)

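The output shows the DataFrame without the rows labeled 2 (Hadoop) and 4 (pandas); the remaining rows keep their original labels 0, 1, 3, 5, and 6.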
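To make the copy behavior concrete, here is a small self-contained check. It uses only the Courses column of the sample data to keep the sketch short.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python",
                "pandas", "Oracle", "Java"],
})

# drop() returns a new DataFrame; df itself keeps all seven rows
df1 = df.drop([2, 4])

print(df1.index.tolist())  # [0, 1, 3, 5, 6]
print(len(df), len(df1))   # 7 5
```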

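If you want to create a new index after removing the list of rows, chain reset_index() to the result. Note that reset_index() keeps the old index as a new column named index by default; pass drop=True to discard it.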


# Reset index
df = pd.DataFrame(technologies)
df1 = df.drop([2,4]).reset_index()
print(df1)
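The following sketch compares the two ways of calling reset_index() after a drop. The two-column frame is a shortened, illustrative version of the sample data.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python",
                "pandas", "Oracle", "Java"],
    "Fee": [20000, 25000, 26000, 22000, 24000, 21000, 22000],
})
df1 = df.drop([2, 4])

# Default: the old labels are kept in a new column named "index"
kept = df1.reset_index()
print(kept.columns.tolist())  # ['index', 'Courses', 'Fee']

# drop=True: the old labels are discarded
fresh = df1.reset_index(drop=True)
print(fresh.columns.tolist())  # ['Courses', 'Fee']
print(fresh.index.tolist())    # [0, 1, 2, 3, 4]
```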

4. Drop a List of Rows From DataFrame inplace

As shown above, drop() by default returns a copy of the DataFrame after removing the rows. If you want to update the existing DataFrame instead, use the inplace=True parameter. With inplace=True, drop() returns None instead of a DataFrame. For example, df.drop([3,5], inplace=True) drops the specified list of rows from the df object.


# Drop a List of Rows From DataFrame inplace
df = pd.DataFrame(technologies)
df.drop([2,4],inplace=True)
print(df)
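Because an in-place drop returns None, assigning its result back to a variable is a common slip. A minimal sketch, using an illustrative one-column frame:

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python",
                "pandas", "Oracle", "Java"],
})

# With inplace=True the rows are removed from df itself
result = df.drop([2, 4], inplace=True)

print(result)             # None
print(df.index.tolist())  # [0, 1, 3, 5, 6]

# Pitfall: writing df = df.drop([2, 4], inplace=True) would rebind df to None
```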

This leaves df with the rows labeled 0, 1, 3, 5, and 6, the same rows that df.drop([2,4]) returned in section 2.

5. Removing Range of Rows From One to Five

You can use slicing on the index to delete a range of rows. For example, df.drop(df.index[:5], inplace=True) removes the first five rows, that is, positions 0 through 4.


# Removing Range of Rows From One to Five
df = pd.DataFrame(technologies)
df.drop(df.index[:5],inplace=True)
print(df)

Yields below output.


# Output:
  Courses    Fee Duration  Discount
5  Oracle  21000   50days      2100
6    Java  22000   55days      2000
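The same slicing idea extends to other ranges. The sketch below shows a few variations on an illustrative one-column frame; each call returns a copy, so df is reused unchanged.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python",
                "pandas", "Oracle", "Java"],
})

# Positions 1 through 4 (the slice end is exclusive)
middle = df.drop(df.index[1:5])
print(middle.index.tolist())       # [0, 5, 6]

# The last two rows
last_two = df.drop(df.index[-2:])
print(last_two.index.tolist())     # [0, 1, 2, 3, 4]

# Every second row, starting at position 0
every_other = df.drop(df.index[::2])
print(every_other.index.tolist())  # [1, 3, 5]
```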

6. Drop List of Rows by Index Position in DataFrame

If your DataFrame has custom index labels, you can still remove a list of rows by position by looking the labels up through df.index. The example below removes the 3rd and 5th records from the DataFrame. Note that positions start from zero, so these are positions 2 and 4.


# Drop List of Rows by Index Position in DataFrame
indexes=['r1','r2','r3','r4','r5','r6','r7']
df = pd.DataFrame(technologies,index=indexes)
df.drop(df.index[[2,4]],inplace=True)
print(df)

Yields below output.


# Output:
    Courses    Fee Duration  Discount
r1    Spark  20000    30day      1000
r2  PySpark  25000   40days      2300
r4   Python  22000   40days      1200
r6   Oracle  21000   50days      2100
r7     Java  22000   55days      2000
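The df.index lookup matters because drop() only understands labels. As a small sketch on an illustrative four-row frame, passing bare integers to a string-labeled DataFrame raises a KeyError, and the errors="ignore" option skips labels that are not present.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"],
     "Fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# Integer positions are not labels here, so drop() cannot find them
try:
    df.drop([1, 3])
    raised = False
except KeyError:
    raised = True
print(raised)  # True

# errors="ignore" skips labels that do not exist and drops the rest
df2 = df.drop(["r2", "r9"], errors="ignore")
print(df2.index.tolist())  # ['r1', 'r3', 'r4']
```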

You can also try removing the rows by applying pandas filter.

Frequently Asked Questions

How does the drop() method work in Pandas DataFrame?

The drop() method in Pandas DataFrame removes rows or columns from the DataFrame based on the index or column labels you specify. By default, it removes rows, but you can set the axis parameter to 1 to remove columns instead.

Can I drop multiple rows at once using drop()?

Yes. Pass a list of index labels for the rows you want to drop to the drop() method. To drop by position, pass df.index[...] with a list of positions.

What parameters are important when dropping rows from a DataFrame?

The labels parameter (with axis=0, the default) and the index parameter both specify which rows to drop by index label. The axis parameter indicates whether labels refer to rows (axis=0) or columns (axis=1), and inplace controls whether the original DataFrame is modified.

How can I ensure the changes made by drop() are applied to the original DataFrame?

Set the inplace parameter to True when calling the drop() method. This modifies the original DataFrame directly rather than creating a new one, and the call returns None.

Is it possible to drop rows based on a condition rather than specific index labels or positions?

Yes. Build a boolean condition, take the index of the rows that match it, and pass that index to drop(). Alternatively, use boolean indexing or the loc[] accessor to keep only the rows that do not match the condition.
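A minimal sketch of a condition-based drop follows. The threshold of 22000 is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python",
                "pandas", "Oracle", "Java"],
    "Fee": [20000, 25000, 26000, 22000, 24000, 21000, 22000],
})

# Boolean mask marking the rows to remove
mask = df["Fee"] < 22000

# Option 1: pass the matching labels to drop()
dropped = df.drop(df[mask].index)

# Option 2: keep the rows that do not match, using boolean indexing
filtered = df.loc[~mask]

print(dropped.index.tolist())    # [1, 2, 3, 4, 6]
print(dropped.equals(filtered))  # True
```

The two options agree here because the index labels are unique; with duplicate labels, drop() removes every row that shares a matching label, so boolean indexing is the safer choice in that case.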

Conclusion

In this article, you have learned how to remove a list of DataFrame rows in pandas using the drop() function, both by index label and by index position.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive, and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.