In this article, I will explain how to remove (drop) multiple columns from a pandas DataFrame, with examples. The drop() method removes rows or columns by label: you pass the labels to remove and choose the axis they belong to (axis=0 for rows, axis=1 for columns).
Note that by default the drop() method returns a new DataFrame (a copy) without the specified columns. To remove the columns from the existing DataFrame instead, use inplace=True.
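The difference between the default copy and inplace=True can be seen in a short sketch. The two-row DataFrame here is a made-up sample used only for illustration.

```python
import pandas as pd

# Hypothetical two-row frame, used only for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"],
                   "Fee": [20000, 25000],
                   "Duration": ["30days", "40days"]})

# Default behaviour: drop() returns a new DataFrame and leaves df untouched
df2 = df.drop(["Courses", "Fee"], axis=1)
print(list(df.columns))   # ['Courses', 'Fee', 'Duration']
print(list(df2.columns))  # ['Duration']

# With inplace=True, df itself is modified and the call returns None
result = df.drop(["Courses", "Fee"], axis=1, inplace=True)
print(result)             # None
print(list(df.columns))   # ['Duration']
```

Because the in-place call returns None, avoid writing df = df.drop(..., inplace=True), which would replace the DataFrame with None.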
1. Quick Examples of pandas Drop Multiple Columns
Below are some quick examples of how to drop multiple columns from pandas DataFrame.
# Drop multiple columns by Name
df.drop(["Courses", "Fee"], axis = 1, inplace=True)
# Drop multiple columns by Index
df.drop(df.columns[[1,2]], axis = 1, inplace=True)
# Drop multiple columns between two columns
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
# Drop multiple columns between two indexes
df.drop(df.iloc[:, 1:2].columns, axis=1, inplace=True)
2. Pandas Drop Multiple Columns
In a pandas DataFrame you will sometimes need to drop multiple columns by name, by index, within a range of indexes, or between two columns. All of these can be done with the drop() method.
By default, the drop() method does not modify the existing DataFrame; instead, it returns a new DataFrame without the specified columns. To remove the columns from the existing DataFrame object, use the inplace=True param.
If a column you want to remove is not present in the DataFrame, drop() raises a KeyError. You can suppress this error with the errors param (errors='ignore').
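As a minimal sketch of the errors param, the snippet below tries to drop a column that does not exist. The column name Discount and the small DataFrame are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical sample frame for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"],
                   "Fee": [20000, 25000],
                   "Duration": ["30days", "40days"]})

# "Discount" is a made-up column name that does not exist in df
try:
    df.drop(["Fee", "Discount"], axis=1)
except KeyError as err:
    print("KeyError:", err)

# errors="ignore" drops the labels that exist and silently skips the rest
df2 = df.drop(["Fee", "Discount"], axis=1, errors="ignore")
print(list(df2.columns))  # ['Courses', 'Duration']
```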
You can also drop rows of the DataFrame by their index labels using the index param.
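The following sketch shows the index param on a small made-up DataFrame. It also shows the columns keyword, which this article does not otherwise use; it is the keyword form of passing labels with axis=1.

```python
import pandas as pd

# Hypothetical sample frame for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Hadoop"],
                   "Fee": [20000, 25000, 26000]})

# index= drops rows by index label; here the rows labelled 0 and 2
df_rows = df.drop(index=[0, 2])
print(df_rows)

# columns= is the keyword counterpart of passing labels with axis=1
df_cols = df.drop(columns=["Fee"])
print(list(df_cols.columns))  # ['Courses']
```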
Now, let’s learn with examples. First, create a DataFrame from a dictionary of lists. Our DataFrame has the columns Courses, Fee and Duration.
import pandas as pd
technologies = ({
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
})
df = pd.DataFrame(technologies)
print(df)
Yields below output.
   Courses    Fee Duration
0    Spark  20000    30day
1  PySpark  25000   40days
2   Hadoop  26000   35days
3   Python  22000   40days
4   pandas  24000   60days
5   Oracle  21000   50days
6     Java  22000   55days
3. Drop Multiple Columns By Name
When you have a list of columns to drop, create a list object with the column names and pass it to the drop() method, or pass the list literal directly. The example below deletes the columns Courses and Fee from the pandas DataFrame.
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)
Yields below output. Use inplace=True to update the DataFrame itself.
  Duration
0    30day
1   40days
2   35days
3   40days
4   60days
5   50days
6   55days
4. Drop Multiple Columns by Index
If you want to drop multiple columns by index, unfortunately the drop() method doesn’t take column positions as a param. We can overcome this by looking up the column names by position with df.columns[]. Use the example below to delete the columns at index 0 and 1 (indexes start from 0). For more examples, refer to remove multiple columns by index.
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)
Yields same output as above.
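To see why this works, it can help to look at what df.columns[[0, 1]] evaluates to before it reaches drop(). The sketch below uses a shortened, made-up version of the DataFrame.

```python
import pandas as pd

# Hypothetical sample frame for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"],
                   "Fee": [20000, 25000],
                   "Duration": ["30days", "40days"]})

# Indexing df.columns with a list of positions returns the matching labels
labels = df.columns[[0, 1]]
print(list(labels))       # ['Courses', 'Fee']

# Those labels are what drop() actually receives
df2 = df.drop(labels, axis=1)
print(list(df2.columns))  # ['Duration']
```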
5. Drop Columns from List
If you have a list of columns and you want to delete all the columns in that list, use the approach below.
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)
6. Remove Columns Between Specified Columns
Use the drop() method together with loc[] to remove all columns from one column name through another. Use [:, 'Courses':'Fee'] to drop the first and second columns; label slices in loc[] include both endpoints. The inplace option applies the change to the original object.
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
print(df)
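A label slice passed to loc[] includes both its start and end labels. The following sketch, on a shortened made-up DataFrame, prints the labels that each slice selects before they are handed to drop().

```python
import pandas as pd

# Hypothetical sample frame for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"],
                   "Fee": [20000, 25000],
                   "Duration": ["30days", "40days"]})

# Label slices in loc[] are inclusive on both ends
first_two = df.loc[:, "Courses":"Fee"].columns
last_two = df.loc[:, "Fee":"Duration"].columns
print(list(first_two))  # ['Courses', 'Fee']
print(list(last_two))   # ['Fee', 'Duration']

# Dropping the first slice leaves only Duration
df2 = df.drop(first_two, axis=1)
print(list(df2.columns))  # ['Duration']
```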
7. Remove Columns Between Specified Indexes
Use the drop() method together with iloc[] to remove all columns in a range of positions. The slice [:, 1:2] selects only the second column, because the end position is excluded. For instance, applied to the original DataFrame, df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1) removes the Fee column.
df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)
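Unlike loc[], a positional slice in iloc[] excludes its end position, so 1:2 covers one column and 0:2 covers two. The sketch below shows this on a shortened, made-up DataFrame.

```python
import pandas as pd

# Hypothetical sample frame for this illustration
df = pd.DataFrame({"Courses": ["Spark", "PySpark"],
                   "Fee": [20000, 25000],
                   "Duration": ["30days", "40days"]})

# Positional slices in iloc[] exclude the end position
one_col = df.iloc[:, 1:2].columns
two_cols = df.iloc[:, 0:2].columns
print(list(one_col))   # ['Fee']
print(list(two_cols))  # ['Courses', 'Fee']

# Dropping positions 0 and 1 leaves only Duration
df2 = df.drop(two_cols, axis=1)
print(list(df2.columns))  # ['Duration']
```

Slicing the column index directly, as in df.columns[0:2], selects the same labels without building an intermediate DataFrame.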
8. Complete Example For Reference
import pandas as pd
technologies = ({
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
})
df = pd.DataFrame(technologies)
print(df)
# Drop multiple columns by Name
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)
# Drop multiple columns by Index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)
# Drop Columns from List
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)
# Drop columns between two columns
df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)
# Drop columns between two indexes
df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)
Conclusion
In this article, you have learned how to remove multiple columns from a pandas DataFrame by name (label) and by index. You have also learned how to remove the columns between two column names or between two index positions.
Happy Learning !!