Pandas Drop Multiple Columns From DataFrame

In this pandas drop multiple columns article, I will explain how to remove/delete/drop multiple columns from a DataFrame with examples. The drop() method removes rows or columns by label: you pass the labels to remove and choose the axis they belong to (axis=0 for rows, axis=1 for columns).

Note that drop() by default returns a new DataFrame (a copy) with the specified columns removed. If you want to remove the columns in place, use inplace=True.
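
The difference between the default copy behaviour and inplace=True can be seen in a small sketch. The two-row DataFrame and the variable names trimmed and result are made up for illustration only.

```python
import pandas as pd

# Small illustrative DataFrame (not the article's full data set).
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# Default behaviour: drop() returns a new DataFrame and leaves df untouched.
trimmed = df.drop(["Fee", "Duration"], axis=1)
print(list(trimmed.columns))  # ['Courses']
print(list(df.columns))       # ['Courses', 'Fee', 'Duration']

# With inplace=True the original object is modified and drop() returns None.
result = df.drop(["Fee", "Duration"], axis=1, inplace=True)
print(result)                 # None
print(list(df.columns))       # ['Courses']
```

Because the in-place call returns None, avoid writing df = df.drop(..., inplace=True), which would replace df with None.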

1. Quick Examples of pandas Drop Multiple Columns

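Below are some quick examples of how to drop multiple columns from a pandas DataFrame. Each statement is an independent alternative; because they all use inplace=True, run only one of them against a given DataFrame.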



# Drop multiple columns by Name
df.drop(["Courses", "Fee"], axis = 1, inplace=True)

# Drop multiple columns by Index
df.drop(df.columns[[1,2]], axis = 1, inplace=True)

# Drop multiple columns between two columns (both endpoints included)
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)

# Drop multiple columns between two indexes (end index excluded)
df.drop(df.iloc[:, 1:2].columns, axis=1, inplace=True)

2. Pandas Drop Multiple Columns

Sometimes you need to drop multiple columns from a pandas DataFrame by name, by index position, within a range of positions, or between two named columns. All of these can be done with the drop() method.

The drop() method does not modify the existing DataFrame by default; instead, it returns a new DataFrame without the specified columns. To remove the columns from the existing DataFrame object, use the inplace=True param.

If a column you want to remove is not present in the DataFrame, drop() raises a KeyError. You can handle this with the errors param: with errors='ignore', the labels that exist are dropped and the missing ones are skipped.
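
As a minimal sketch of the errors param, the snippet below tries to drop a column that does not exist. The column name "Discount" is hypothetical and chosen only because it is absent from the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# "Discount" is a hypothetical label that is not a column of df.
try:
    df.drop(["Fee", "Discount"], axis=1)
except KeyError as exc:
    print("KeyError:", exc)

# errors="ignore" drops the labels that exist and skips the missing ones.
df2 = df.drop(["Fee", "Discount"], axis=1, errors="ignore")
print(list(df2.columns))  # ['Courses', 'Duration']
```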

You can also drop rows by their index labels using the index param.
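
The index param, and its counterpart columns, can be used instead of a labels-plus-axis pair. The following is a small sketch on a three-row DataFrame created just for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Duration": ["30days", "40days", "35days"],
})

# columns=[...] is equivalent to passing the labels with axis=1.
by_columns = df.drop(columns=["Courses", "Fee"])
print(list(by_columns.columns))  # ['Duration']

# index=[...] removes rows by index label (here the default labels 0 and 2).
by_index = df.drop(index=[0, 2])
print(list(by_index.index))      # [1]

# Both params can be combined in a single call.
both = df.drop(index=[0], columns=["Fee"])
print(both.shape)                # (2, 2)
```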

Now, let's learn with examples. First, create a DataFrame from a dictionary of lists. Our DataFrame has the columns Courses, Fee, and Duration.


import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

This yields the output below.


   Courses    Fee Duration
0    Spark  20000    30day
1  PySpark  25000   40days
2   Hadoop  26000   35days
3   Python  22000   40days
4   pandas  24000   60days
5   Oracle  21000   50days
6     Java  22000   55days

3. Drop Multiple Columns By Name

When you have several columns to drop, put their names in a list and pass it to the drop() method, either through a variable or directly as a list literal. The example below deletes the Courses and Fee columns from the pandas DataFrame.


df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

This yields the output below. Use inplace=True to modify df itself instead of creating a new DataFrame.


  Duration
0    30day
1   40days
2   35days
3   40days
4   60days
5   50days
6   55days

4. Drop Multiple Columns by Index

The drop() method accepts labels rather than integer positions, so to drop multiple columns by index you first look up the column names by position with df.columns[] and pass them to drop(). The example below deletes the columns at positions 0 and 1 (positions start from 0). For more examples, refer to the article on removing multiple columns by index.


df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

This yields the same output as above, because the columns at positions 0 and 1 are Courses and Fee.

5. Drop Columns from List

If you already have the column names in a list variable and want to delete all of those columns, pass the variable to drop() as shown below.


lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)

6. Remove Columns Between Specified Columns

Use loc[] together with the drop() method to remove every column from one column name through another. A label slice such as [ : , 'Courses':'Fee'] includes both endpoints, so here it selects the first and second columns. This example assigns the result to df2 so that df stays intact for the next section; use inplace=True instead to work on the original object.


df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)
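
To make the range behaviour easier to see, here is a small sketch with one extra column. The "Discount" column is hypothetical and added only so that columns remain on both sides of the dropped range.

```python
import pandas as pd

# "Discount" is a hypothetical extra column, not part of the article's data.
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Label slices with loc[] include both endpoints: Fee and Duration.
to_drop = df.loc[:, "Fee":"Duration"].columns
print(list(to_drop))      # ['Fee', 'Duration']

df2 = df.drop(to_drop, axis=1)
print(list(df2.columns))  # ['Courses', 'Discount']
```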

7. Remove Columns Between Specified Indexes

Use iloc[] together with the drop() method to remove columns within a range of positions. The end of a position slice is excluded, so [: , 1:2] selects only the second column (position 1). For instance, df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1) removes the Fee column.


df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)
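
Since the end of a position slice is excluded, dropping more than one column requires a wider range. The sketch below compares 1:2 with 1:3 on a small DataFrame built for this illustration; slicing df.columns directly is used here as an equivalent way to get the names.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# 1:2 covers position 1 only, so just Fee is selected.
one = df.iloc[:, 1:2].columns
print(list(one))          # ['Fee']

# 1:3 covers positions 1 and 2, so Fee and Duration are selected.
two = df.columns[1:3]
print(list(two))          # ['Fee', 'Duration']

df2 = df.drop(two, axis=1)
print(list(df2.columns))  # ['Courses']
```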

8. Complete Example For Reference


import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

# Drop multiple columns by Name
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

# Drop multiple columns by Index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

# Drop Columns from List
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)

# Drop columns between two columns
df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)

# Drop columns between two indexes
df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)

Conclusion

In this pandas drop multiple columns article, you have learned how to remove or delete multiple columns from a DataFrame by name, by index position, and from a list of names. You have also learned how to remove the columns between two column names and between two index positions.

Happy Learning !!

NNK