By using the pandas.DataFrame.drop() method you can drop/remove/delete rows from a DataFrame. The axis param specifies which axis to remove from. By default axis=0, meaning rows are removed. Use axis=1 or the columns param to remove columns. By default, pandas returns a copy of the DataFrame after deleting rows; use inplace=True to remove them from the existing DataFrame.
Related: Drop DataFrame Rows by Checking Conditions
In this article, I will cover how to remove rows by labels, indexes, and ranges, how to drop rows inplace, and how to drop rows with None, NaN & Null values, with examples. If you have duplicate rows, use drop_duplicates() to drop duplicate rows from a pandas DataFrame.
Key Points –
- Use the drop() method to remove rows by specifying the row labels or indices.
- Set the axis parameter to 0 (or omit it) to indicate that rows should be dropped.
- Use the inplace parameter to modify the original DataFrame directly without creating a new one.
- After dropping rows, consider resetting the index with reset_index() to maintain sequential indexing.
- Set the errors parameter to 'ignore' to suppress errors when attempting to drop non-existent row labels.
- Leverage the query() method to filter and drop rows based on complex conditions.
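To make the last few key points concrete, here is a minimal sketch. The DataFrame and the variable names (trimmed, renumbered, kept) are only illustrative, and the label 'r9' is deliberately one that does not exist.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"],
     "Fee": [20000, 25000, 26000, 22000]},
    index=["r1", "r2", "r3", "r4"],
)

# 'r9' is not in the index; errors='ignore' skips it instead of raising KeyError
trimmed = df.drop(["r1", "r9"], errors="ignore")

# Replace the remaining labels with a fresh 0..n-1 index
renumbered = trimmed.reset_index(drop=True)

# query() returns the rows that match the expression, so write the
# condition for the rows you want to keep
kept = df.query("Fee >= 22000")

print(renumbered)
print(kept)
```

Note that query() does not delete anything by itself; it returns the matching rows, so the rows that fail the condition are the ones that get dropped.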
pandas.DataFrame.drop() Syntax – Drop Rows & Columns
Let’s look at the syntax of the DataFrame drop() function.
# Pandas DataFrame drop() Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
Parameters
- labels – Single label or list-like. It’s used with the axis param.
- axis – Defaults to 0. Use 1 to drop columns and 0 to drop rows.
- index – Use to specify rows. Accepts a single label or list-like.
- columns – Use to specify columns. Accepts a single label or list-like.
- level – int or level name, optional. Use for MultiIndex.
- inplace – Default False, returns a copy of the DataFrame. When set to True, it drops the rows or columns inplace (on the current DataFrame) and returns None.
- errors – {'ignore', 'raise'}, default 'raise'.
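The level param is the least obvious one in the list above, so here is a small sketch of how it can be used. The MultiIndex, its level names (year, course) and the values are made up for illustration only.

```python
import pandas as pd

# Hypothetical two-level row index: (year, course)
idx = pd.MultiIndex.from_tuples(
    [("2023", "Spark"), ("2023", "Hadoop"), ("2024", "Spark")],
    names=["year", "course"],
)
mdf = pd.DataFrame({"Fee": [20000, 26000, 21000]}, index=idx)

# Drop every row whose 'course' level equals 'Spark', whatever the year
result = mdf.drop("Spark", level="course")
print(result)
```

Without level, drop() on a MultiIndex matches labels against the first level, so naming the level is what lets you target the inner one.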
Let’s create a DataFrame, run some examples, and explore the output. Note that our DataFrame contains index labels for rows which I am going to use to demonstrate removing rows by labels.
# Create a DataFrame
import pandas as pd
import numpy as np
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python"],
'Fee' :[20000,25000,26000,22000],
'Duration':['30day','40days',np.nan, None],
'Discount':[1000,2300,1500,1200]
}
indexes=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=indexes)
print(df)
Yields below output.
Pandas Drop Rows From DataFrame Examples
By default, the drop() method removes rows (axis=0) from a DataFrame. Let’s see several examples of how to remove rows from a DataFrame.
Drop rows by Index Labels or Names
One of pandas’ advantages is that you can assign labels/names to rows, similar to column names. If you have a DataFrame with row labels (index labels), you can specify which rows you want to remove by label name.
# Drop rows by Index Label
df = pd.DataFrame(technologies,index=indexes)
df1 = df.drop(['r1','r2'])
print("Drop rows from DataFrame:\n", df1)
Yields below output.
Alternatively, you can write the same statement by using the index parameter.
# Delete Rows by Index Labels
df1 = df.drop(index=['r1','r2'])
And by using labels and axis as below.
# Delete Rows by Index Labels & axis
df1 = df.drop(labels=['r1','r2'])
df1 = df.drop(labels=['r1','r2'],axis=0)
Notes:
- As you can see, using labels with axis=0 is equivalent to using index=label names.
- axis=0 means rows. By default the drop() method uses axis=0, hence you don’t have to specify it to remove rows.
- To remove columns, explicitly specify axis=1 or columns.
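For comparison, the two ways of removing columns mentioned in the notes can be sketched as follows. The small DataFrame and the variable names here are only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark"],
     "Fee": [20000, 25000],
     "Discount": [1000, 2300]},
    index=["r1", "r2"],
)

# Two equivalent ways to drop the 'Fee' column
by_axis = df.drop("Fee", axis=1)
by_columns = df.drop(columns=["Fee"])

# index= and columns= can be combined to drop rows and columns in one call
both = df.drop(index=["r1"], columns=["Discount"])

print(by_axis.equals(by_columns))
print(both)
```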
Drop Rows by Index Number (Row Number)
Similarly, by using the drop() method you can also remove rows by index position from a pandas DataFrame. The drop() method doesn’t take a position as a param, hence we need to get the row labels from the index and pass these to the drop method. We will use df.index to get the row labels for the positions we want to delete.
df.index.values returns all row labels as a NumPy array.
df.index[[1,3]] gets you the row labels for the 2nd and 4th rows; passing these to the drop() method removes these rows. Note that in Python, indexing starts from zero.
# Delete Rows by Index numbers
df = pd.DataFrame(technologies,index=indexes)
df1=df.drop(df.index[[1,3]])
print(df1)
This removes rows r2 and r4, leaving r1 and r3. In order to drop the first row, you can use df.drop(df.index[0]), and to drop the last row use df.drop(df.index[-1]).
# Removes First Row
df=df.drop(df.index[0])
# Removes Last Row
df=df.drop(df.index[-1])
Delete Rows by Index Range
You can also remove rows by specifying an index range. The below example removes all rows starting from the 3rd row.
# Delete Rows by Index Range
df = pd.DataFrame(technologies,index=indexes)
df1=df.drop(df.index[2:])
print(df1)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30day 1000
r2 PySpark 25000 40days 2300
Delete Rows when you have Default Index
By default, pandas assigns a sequence number to every row, also called the index. The row index starts from zero and increments by 1 for every row. If you are not using custom index labels, these sequence numbers are the row labels, so you pass them to drop() directly. To remove rows with the default index, you can try below.
# Remove rows when you have default index.
df = pd.DataFrame(technologies)
df1 = df.drop(0)
df3 = df.drop([0, 3])
df4 = df.drop(range(0,2))
Note that df.drop(-1) doesn’t remove the last row; because the label -1 is not present in the index, it raises a KeyError. You can still use df.drop(df.index[-1]) to remove the last row.
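The difference between a label and a position is easy to check with a short sketch. The variable names below (raised, last_removed) are only illustrative.

```python
import pandas as pd

# Default index: the row labels are 0, 1, 2, 3
df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Hadoop", "Python"]})

# drop() looks for the *label* -1, which does not exist here
try:
    df.drop(-1)
    raised = False
except KeyError:
    raised = True

# Translate the position -1 into its label first, then drop that label
last_removed = df.drop(df.index[-1])

print(raised)
print(last_removed)
```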
Remove DataFrame Rows Inplace
All examples you have seen above return a copy of the DataFrame after removing rows. If you want to remove rows inplace from the existing DataFrame, use inplace=True. By default, the inplace param is set to False.
# Delete Rows inplace
df = pd.DataFrame(technologies,index=indexes)
df.drop(['r1','r2'],inplace=True)
print(df)
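One thing worth keeping in mind with inplace=True is that drop() then returns None, so the result should not be assigned back. A minimal sketch, with illustrative variable names:

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"]},
    index=["r1", "r2", "r3", "r4"],
)

# With inplace=True the DataFrame itself is modified and None is returned
result = df.drop(["r1", "r2"], inplace=True)

print(result)          # None
print(list(df.index))  # remaining row labels
```

Writing df = df.drop(..., inplace=True) is therefore a mistake: it would replace df with None.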
Drop Rows by Checking Conditions
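Most of the time we also need to remove DataFrame rows based on some condition (column value). You can do this by selecting the rows you want to keep with a boolean condition inside loc[]. The example below keeps the rows where Discount is at least 1500, which drops all the others.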
# Delete Rows by Checking Conditions
df = pd.DataFrame(technologies)
df1 = df.loc[df["Discount"] >=1500 ]
print(df1)
Yields below output.
# Output:
Courses Fee Duration Discount
1 PySpark 25000 40days 2300
2 Hadoop 26000 NaN 1500
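If you prefer to express the same thing with drop(), one possible approach is to collect the labels of the rows that match the unwanted condition and pass them to drop(). This is a sketch with illustrative variable names; it gives the same rows as filtering with a negated mask.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python"],
    "Discount": [1000, 2300, 1500, 1200],
})

# Labels of the rows we want to get rid of
low_discount = df[df["Discount"] < 1500].index

# Option 1: pass those labels to drop()
via_drop = df.drop(low_discount)

# Option 2: keep everything that does NOT match the condition
via_mask = df[~(df["Discount"] < 1500)]

print(via_drop)
```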
Drop Rows that Have NaN/None/Null Values
While working with analytics you are often required to clean up data that has None, Null & np.nan values. By using df.dropna() you can remove rows that contain NaN values from a DataFrame.
# Delete rows with Nan, None & Null Values
df = pd.DataFrame(technologies,index=indexes)
df2=df.dropna()
print(df2)
This removes all rows that have None, Null & NaN values in any column.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30day 1000
r2 PySpark 25000 40days 2300
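dropna() also accepts params to narrow down which rows are removed. The sketch below uses subset and thresh; the data is a variation of the article's DataFrame with one extra missing Fee value added purely for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"],
     "Fee": [20000, np.nan, 26000, 22000],
     "Duration": ["30day", "40days", np.nan, None]},
    index=["r1", "r2", "r3", "r4"],
)

# Only look at the 'Duration' column when deciding what to drop
by_subset = df.dropna(subset=["Duration"])

# Keep rows that have at least 3 non-missing values
by_thresh = df.dropna(thresh=3)

print(by_subset)
print(by_thresh)
```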
Remove Rows by Slicing DataFrame
You can also drop rows by slicing, which selects rows by position and leaves out the rest. Remember the index starts from zero.
# Remove Rows by Slicing DataFrame
df2=df[4:] # Returns rows from the 5th row onward (drops the first 4)
df2=df[1:-1] # Removes first and last row
df2=df[2:4] # Returns the 3rd and 4th rows (positions 2 and 3)
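The same position-based slicing can be written with iloc[], which makes it explicit that positions rather than labels are meant. This is a sketch with illustrative variable names; the last line shows how to remove a middle range instead of keeping one.

```python
import pandas as pd

df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Python"]},
    index=["r1", "r2", "r3", "r4"],
)

without_first_two = df.iloc[2:]          # drops the first 2 rows
without_ends = df.iloc[1:-1]             # drops the first and last row
without_middle = df.drop(df.index[1:3])  # drops the 2nd and 3rd rows

print(without_middle)
```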
Related: You can also remove first N rows from pandas DataFrame and remove last N Rows from pandas DataFrame
FAQ on Drop Rows From Pandas DataFrame
You can use the dropna() method to remove rows containing missing values (NaN).
You can use the drop() method with the index labels you want to remove.
You can use boolean indexing to filter rows based on a condition and create a new DataFrame without the rows that don’t meet the condition.
You can use the drop_duplicates() method to remove duplicate rows based on the values in one or more columns.
The drop() method does not accept a condition directly. To drop rows based on your specific criteria, select the rows that match the condition and pass their index labels to drop().
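Since drop_duplicates() is mentioned but not demonstrated, here is a minimal sketch. The data contains deliberately repeated values and is made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "Spark", "Hadoop", "Spark"],
    "Fee": [20000, 20000, 26000, 22000],
})

# Rows are duplicates only if all columns match; the first copy is kept
unique_rows = df.drop_duplicates()

# Compare on 'Courses' only and keep the last occurrence of each course
unique_courses = df.drop_duplicates(subset=["Courses"], keep="last")

print(unique_rows)
print(unique_courses)
```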
Conclusion
In this pandas drop rows article you have learned how to drop/remove pandas DataFrame rows using the drop() method. By default drop() deletes rows (axis=0); if you want to delete columns, you have to use either axis=1 or the columns=labels param.
Happy Learning !!