The pandas DataFrame.drop() method removes columns from a DataFrame. By default it does not modify the existing DataFrame; it returns a new DataFrame without the specified columns. To remove columns from the existing DataFrame object, pass inplace=True.
In this article, I will explain how to drop columns by name, by index, between two columns, and more. The drop() method removes columns or rows according to the given labels and the corresponding axis.
Now, let’s look at the drop() syntax and at how to delete one or more columns from a DataFrame, with examples.
Key Points –
- drop() removes columns or rows based on labels; the axis selects which (1 for columns, 0 for rows).
- drop() returns a new DataFrame by default; inplace=True modifies the original DataFrame directly and returns None.
- Columns can be dropped by name by passing labels or columns to drop(), or by position by looking up the names with df.columns[].
- Column ranges can be selected with iloc[] (by position) or loc[] (by name, between two specified columns) and passed to drop().
- Columns can be removed based on a condition by iterating over the columns and checking their names or values.
1. Quick Examples of Drop Columns
If you are in a hurry, below are some quick examples of how to drop columns by name, by index, etc.
# Quick examples of drop columns
# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
df2=df.drop(columns=["Fee"], axis = 1)
df2=df.drop(labels=["Fee"], axis = 1)
# Drop single column by Index
df2=df.drop(df.columns[1], axis = 1)
# Updates the DataFrame in place
df.drop(df.columns[1], axis = 1, inplace=True)
# Drop multiple columns
df.drop(["Courses", "Fee"], axis = 1, inplace=True)
df.drop(df.columns[[1,2]], axis = 1, inplace=True)
# Other ways to drop columns
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
df.drop(df.iloc[:, 1:2], axis=1, inplace=True)
2. Syntax of pandas.DataFrame.drop()
Below is the syntax of pandas.DataFrame.drop() method.
# pandas DataFrame drop() Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
2.1 Parameters of the DataFrame.drop()
Following are the parameters of the drop() method.
- labels – Single label or list-like. The index or column labels to drop.
- axis – {0 or ‘index’, 1 or ‘columns’}, default is 0. Specifies whether to drop labels from the index (0 or ‘index’) or from the columns (1 or ‘columns’).
- index – An alternative to specifying the axis and labels. Provides the index labels to drop, as if axis were 0.
- columns – An alternative to specifying the axis and labels. Provides the column labels to drop, as if axis were 1.
- level – int or level name, optional. For a MultiIndex, the level from which the labels will be removed.
- inplace – If True, the DataFrame is modified in place and the return value is None. If False (default), a new DataFrame with the specified columns or index labels removed is returned.
- errors – {‘raise’, ‘ignore’}, default is ‘raise’. If ‘raise’, an error is raised for an invalid key. If ‘ignore’, any invalid key is silently ignored.
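As a rough illustration of the columns and level parameters described above, the sketch below first shows that labels with axis=1 and columns give the same result, and then drops by the second level of a MultiIndex. The small df_demo frame and the two-level column index (A/B over x/y) are invented for this example and are not part of the article's data.

```python
import pandas as pd

df_demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# These two calls are equivalent ways of dropping the "Fee" column
by_axis = df_demo.drop(labels=["Fee"], axis=1)
by_columns = df_demo.drop(columns=["Fee"])
print(by_axis.equals(by_columns))

# Hypothetical two-level column index, invented for this illustration
multi_cols = pd.MultiIndex.from_tuples([("A", "x"), ("A", "y"), ("B", "x")])
df_multi = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=multi_cols)

# level=1 matches "x" against the second level only
without_x = df_multi.drop(columns="x", level=1)
print(without_x.columns.tolist())
```

Using columns= instead of labels= with axis=1 tends to read more clearly, because the call itself says which axis is affected.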
Return Value
The pandas.DataFrame.drop() method returns a new DataFrame with the specified labels (rows or columns) removed. If the inplace parameter is set to True, the method modifies the existing DataFrame in place and returns None.
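The difference between the two return behaviours can be seen in a minimal sketch like the one below. The two-row df_demo frame is a shortened stand-in for the article's data, used only for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Default: a new DataFrame is returned and df_demo keeps all its columns
dropped = df_demo.drop(columns=["Fee"])
columns_after_default = list(df_demo.columns)
print(list(dropped.columns))
print(columns_after_default)

# inplace=True: df_demo itself is modified and the call returns None
returned = df_demo.drop(columns=["Fee"], inplace=True)
print(returned)
print(list(df_demo.columns))
```

A common mistake is writing df = df.drop(..., inplace=True), which replaces df with None.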
Create Pandas DataFrame
Now, let’s see a detailed example. First, create a pandas DataFrame from a dictionary of lists. Our DataFrame has the columns Courses, Fee and Duration.
# Create DataFrame
import pandas as pd
technologies = ({
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
})
df = pd.DataFrame(technologies)
print(df)
Running this prints a DataFrame with seven rows and three columns: Courses, Fee and Duration.
3. Pandas Drop Columns
The pandas drop() method removes columns from the DataFrame by name or, indirectly, by index. By default it does not modify the existing DataFrame; it returns a new DataFrame without the specified columns. To remove columns from the existing DataFrame object, pass inplace=True.
If a column you want to remove is not present in the DataFrame, drop() raises a KeyError. You can control this behaviour with the errors param.
You can also drop rows by their index labels using the index param.
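The errors and index params mentioned above can be tried out with a short sketch such as the following. The column name Discount is hypothetical and deliberately absent from the frame, and df_demo is a shortened stand-in for the article's data.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# "Discount" is a hypothetical column that does not exist in df_demo
try:
    df_demo.drop(columns=["Discount"])
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# errors="ignore" skips labels that are not found and drops the rest
safe = df_demo.drop(columns=["Fee", "Discount"], errors="ignore")
print(list(safe.columns))

# index= drops rows by index label instead of columns
rows = df_demo.drop(index=[0])
print(rows.index.tolist())
```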
3.1 Drop Column by Name
This example removes the column named Fee from a DataFrame. Note that you need axis=1 in order to delete columns rather than rows.
# Drops 'Fee' column
df2=df.drop(["Fee"], axis = 1)
print(df2)
# Explicitly using parameter name 'labels'
df2=df.drop(labels=["Fee"], axis = 1)
# Alternatively, use columns instead of labels; axis is then not needed
df2=df.drop(columns=["Fee"])
Yields below output. Use inplace=True to update the DataFrame itself.
# Output:
   Courses Duration
0    Spark    30day
1  PySpark   40days
2   Hadoop   35days
3   Python   40days
4   pandas   60days
5   Oracle   50days
6     Java   55days
3.2 Drop Column by Index
To remove DataFrame columns by index, first get the column labels with df.columns and then pick the label at the required position. Note that positions start from 0 in Python. In the example below, df.columns[[1]] selects the second column of the DataFrame, which is Fee.
# Drop column by index
print(df.drop(df.columns[[1]], axis = 1))
# Using inplace=True
df.drop(df.columns[[1]], axis = 1, inplace=True)
print(df)
Yields the same output as above.
4. Drop Multiple Columns From DataFrame
Below are some examples of dropping multiple columns from DataFrame by column name and index.
4.1 Drop Two or More Columns By Label Name
When you have several column names to drop, put them in a list and pass it to the drop() method, either through a variable or directly. The example below deletes the columns Courses and Fee from the DataFrame.
# Drop two or more columns
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)
Yields below output. Use inplace=True to update the DataFrame itself.
# Output:
  Duration
0    30day
1   40days
2   35days
3   40days
4   60days
5   50days
6   55days
4.2 Drop Two or More Columns by Index
The drop() method works with labels, so it does not accept column positions directly. To drop two or more columns by index, look up their names by position with df.columns[] and pass the result to drop(). The example below deletes the columns at positions 0 and 1 (positions start from 0).
# Drop two or more columns by index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)
Yields the same output as above.
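The position-to-label lookup can be compared with a purely positional alternative, as in the sketch below. The positions list and the shortened df_demo frame are chosen only for illustration. The iloc variant is not from the article; it is one possible way to stay positional, which may matter if column names are duplicated.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

positions = [0, 2]  # first and last column, chosen for illustration

# Translate positions into labels, then drop by label
labels = df_demo.columns[positions]
by_label = df_demo.drop(columns=labels)

# Alternative: keep every position that is not in the list
keep = [i for i in range(df_demo.shape[1]) if i not in positions]
by_position = df_demo.iloc[:, keep]

print(list(by_label.columns))
print(by_label.equals(by_position))
```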
4.3 Drop Columns from List of Columns
If you have a list of columns and you want to delete all columns from the list, use the below approach.
# Drop columns from list of columns
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)
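When the list comes from elsewhere, some of its names may not exist in the DataFrame. One possible way to handle that, sketched below, is to split the list into present and missing names before calling drop(). The name Discount is hypothetical and the variable names are illustrative.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# "Discount" is a hypothetical name that is not a column of df_demo
candidates = ["Fee", "Discount"]

# Split the list into names that exist and names that do not
present = [c for c in candidates if c in df_demo.columns]
missing = [c for c in candidates if c not in df_demo.columns]

df2 = df_demo.drop(columns=present)
print(list(df2.columns))
print("Not found:", missing)
```

Compared with errors='ignore', this keeps a record of which names were not found.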
5. Other ways to Remove Columns from DataFrame
The sections above cover the most commonly used ways to remove columns from a DataFrame. Below are some other ways to remove one or more columns.
5.1 Remove columns From DataFrame Inplace
To remove a column in place, pass inplace=True to drop(). In that case drop() modifies df itself and returns None. The example below drops the second column, Fee, from df.
# Remove columns using drop() with inplace
df.drop(df.columns[1], axis = 1, inplace=True)
5.2 Remove Columns from a List of Columns (iteratively) By Condition.
One of the examples above removes columns given as a list of names. Now let’s see another example that does the same iteratively, checking each column name against a condition. This code removes the Fee column.
# Remove columns using for loop
# Iterate over a copy of the names, since df is modified inside the loop
for col in list(df.columns):
    if 'Fee' in col:
        del df[col]
print(df)
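The same name-based condition can also be written without an explicit loop. The sketch below builds a boolean mask with the string accessor on df.columns; it is an alternative to the loop above rather than the article's own method, and df_demo is a shortened stand-in for the article's data.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# Boolean array: True for every column whose name contains "Fee"
mask = df_demo.columns.str.contains("Fee")

# Keep the columns that do not match the condition
df2 = df_demo.loc[:, ~mask]

# Equivalent: pass the matching labels to drop()
df3 = df_demo.drop(columns=df_demo.columns[mask])

print(list(df2.columns))
```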
5.3 Using df.loc[] to Remove Columns Between Specified Columns
Combine drop() with loc[] to remove all columns between one column name and another. The slice [:, 'Courses':'Fee'] selects every column from Courses through Fee, both included, which here are the first and second columns. With inplace=True the change is applied to the original object.
# Remove columns using df.loc[] & drop()
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
print(df)
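Because label slices in loc[] include both endpoints, it can help to inspect the selected labels before dropping them. A minimal sketch, using a shortened stand-in (df_demo) for the article's data:

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# Label slices include both the start and the end label
between = df_demo.loc[:, "Courses":"Fee"].columns
print(list(between))

df2 = df_demo.drop(columns=between)
print(list(df2.columns))
```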
5.4 Using df.iloc[] to Remove Columns Between Specified Column Indexes.
Combine drop() with iloc[] to remove all columns between two column positions. The slice [:, 1:2] selects only the second column, because the end position is excluded. For instance, df.drop(df.iloc[:, 1:2], inplace=True, axis=1) removes the Fee column.
# Remove columns using df.iloc[] & drop()
df.drop(df.iloc[:, 1:2], inplace=True, axis=1)
print(df)
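Unlike label slices, positional slices exclude the end position. The sketch below shows the difference between 1:2 and 1:3 by slicing df.columns directly, which selects the same labels as df.iloc[:, 1:2].columns. The df_demo frame is a shortened stand-in for the article's data.

```python
import pandas as pd

df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# Positional slices exclude the end position
one_column = df_demo.columns[1:2]
two_columns = df_demo.columns[1:3]
print(list(one_column))
print(list(two_columns))

df2 = df_demo.drop(columns=two_columns)
print(list(df2.columns))
```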
6. Complete Example For Reference
import pandas as pd
technologies = ({
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
})
df = pd.DataFrame(technologies)
print(df)
# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
print(df2)
df2=df.drop(columns=["Fee"], axis = 1)
print(df2)
df2=df.drop(labels=["Fee"], axis = 1)
print(df2)
# Drop column by index
df2=df.drop(df.columns[1], axis = 1)
print(df2)
# Drop multiple columns by Name
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)
# Drop multiple columns by Index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)
# Drop Columns from List
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)
# Drop columns between two columns (by name)
df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)
# Drop columns between two positions (df is left unchanged for the next step)
df2=df.drop(df.iloc[:, 1:2], axis=1)
print(df2)
# Drop columns by condition
for col in list(df.columns):
    if 'Fee' in col:
        del df[col]
print(df)
FAQ on Pandas Drop Columns from DataFrame
To drop multiple columns from a DataFrame in pandas, pass a list of column names to the columns parameter of the drop() method, for example df.drop(columns=['Courses', 'Fee']). All listed columns are removed in a single call, and the result can be assigned back to df.
To drop columns by their position (index), first look up the names with df.columns[], for example df.columns[[0, 1]], and pass the result to drop(). The drop() method itself works with labels, not positions.
To drop columns in place without creating a new DataFrame, pass inplace=True to the drop() method. This modifies the original DataFrame (df) directly, so no reassignment is needed.
It is also possible to drop columns based on a condition in pandas. You can use boolean indexing to identify the columns that meet a specific condition and then drop those columns.
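A condition on column values, rather than names, could look like the sketch below. The Discount column, the 21000 threshold, and the variable names are all invented for this illustration.

```python
import numpy as np
import pandas as pd

# "Discount" is a hypothetical, entirely empty column added for this example
df_demo = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Discount": [np.nan, np.nan, np.nan],
})

# Condition 1: columns in which every value is missing
is_empty = df_demo.isna().all()
empty_cols = is_empty[is_empty].index.tolist()
df2 = df_demo.drop(columns=empty_cols)

# Condition 2: numeric columns whose mean exceeds an arbitrary threshold
means = df2.select_dtypes(include="number").mean()
high_cols = means[means > 21000].index.tolist()
df3 = df2.drop(columns=high_cols)

print(empty_cols, high_cols, list(df3.columns))
```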
Conclusion
In this pandas drop columns article, you have learned how to remove one or more columns from a DataFrame by name, label, and index. You have also learned how to remove the columns between two columns, along with several other examples.
Happy Learning !!