Pandas – Drop Single & Multiple Columns From DataFrame

In this article, I will explain how to remove/delete/drop a single column and multiple (two or more) columns from a pandas DataFrame. The drop() method removes rows or columns according to the specified labels and the corresponding axis.

Note that the drop() method by default returns a new DataFrame (a copy) with the specified columns removed. If you want to remove a column in place, use inplace=True.

Now, let’s see the drop() syntax and how to drop one or more columns from a pandas DataFrame with examples.

1. pandas.DataFrame.drop() Syntax


# pandas DataFrame drop() Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
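  • labels – single label or list-like. Index or column labels to drop.
  • axis – Use 1 (or 'columns') to drop columns and 0 (or 'index') to drop rows from the DataFrame. Default is 0.
  • index – single label or list-like. Row labels to drop; index=labels is equivalent to labels, axis=0.
  • columns – single label or list-like. Column labels to drop; columns=labels is equivalent to labels, axis=1.
  • level – int or level name, optional. For a MultiIndex, the level from which the labels are removed.
  • inplace – Default False, which returns a new DataFrame. When True, it drops the column in place and returns None.
  • errors – {‘ignore’, ‘raise’}, default ‘raise’. With ‘ignore’, labels that do not exist are skipped and only the existing labels are dropped.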
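
The relationship between labels/axis and the index/columns keywords can be checked with a small sketch. The frame below uses made-up column names A, B and C that are not part of this article's data; it only illustrates that the paired forms give the same result.

```python
import pandas as pd

# Small illustrative frame (hypothetical data, not the article's DataFrame)
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6]})

# Two equivalent ways to drop column "B"
by_axis = df.drop("B", axis=1)
by_columns = df.drop(columns="B")

# Two equivalent ways to drop the row labeled 0
row_by_axis = df.drop(0, axis=0)
row_by_index = df.drop(index=0)

print(by_axis.equals(by_columns))        # True
print(row_by_axis.equals(row_by_index))  # True
print(list(by_columns.columns))          # ['A', 'C']
```

Using columns= or index= tends to read more clearly than a bare axis number, since the call itself says which axis is affected.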

Below are some quick examples of using the drop() method.


# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
df2=df.drop(columns=["Fee"])
df2=df.drop(labels=["Fee"], axis = 1)

# Drop single column by Index
df2=df.drop(df.columns[1], axis = 1)

# Update the DataFrame in place
df.drop(df.columns[1], axis = 1, inplace=True)

# Drop multiple columns
df.drop(["Courses", "Fee"], axis = 1, inplace=True)
df.drop(df.columns[[1,2]], axis = 1, inplace=True)

# Other ways to drop columns
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
df.drop(df.iloc[:, 1:2].columns, axis=1, inplace=True)

2. Drop Column Explained with Examples

Now, let’s see a detailed example. First, create a pandas DataFrame from a dictionary of lists. Our DataFrame has the columns Courses, Fee and Duration.


import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

This yields the output below.


   Courses    Fee Duration
0    Spark  20000    30day
1  PySpark  25000   40days
2   Hadoop  26000   35days
3   Python  22000   40days
4   pandas  24000   60days
5   Oracle  21000   50days
6     Java  22000   55days

3. Drop Single Column From Pandas DataFrame

You can delete a column from a pandas DataFrame by position (index) or by column name (label).

3.1 Drop Column By Label Name

This example removes the Fee column from a DataFrame. Note that you need axis=1 in order to delete columns.


# Drops 'Fee' column
df2=df.drop(["Fee"], axis = 1)
print(df2)

# Explicitly using parameter name 'labels'
df2=df.drop(labels=["Fee"], axis = 1)

# Alternatively, use columns instead of labels; axis is then not needed.
df2=df.drop(columns=["Fee"])

This yields the output below. Use inplace=True to modify the DataFrame itself.


   Courses Duration
0    Spark    30day
1  PySpark   40days
2   Hadoop   35days
3   Python   40days
4   pandas   60days
5   Oracle   50days
6     Java   55days

3.2 Drop Column by Index

To remove a column by position, first get the column labels from df.columns, which is an Index object, and then select the label at the desired position. Note that positions start from 0 in Python. In the example below, df.columns[[1]] selects the second column of the DataFrame, which is Fee.


# Drop column by index.
print(df.drop(df.columns[[1]], axis = 1))

# using inplace=True
#df.drop(df.columns[[1]], axis = 1, inplace=True)
#print(df)

This yields the same output as above.
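
One caveat is worth illustrating: df.columns[1] is a label, so drop() still removes by name. The sketch below uses a hypothetical frame with a duplicated Fee label (the article's DataFrame has no duplicates) to show that every column sharing the label is removed, and offers a purely positional alternative with iloc[].

```python
import pandas as pd

# Hypothetical frame with a duplicated column label "Fee"
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["Courses", "Fee", "Fee"])

# df.columns[1] is the label "Fee", so drop() removes every column with that label
by_label = df.drop(df.columns[1], axis=1)
print(list(by_label.columns))     # ['Courses']

# Position-based alternative: keep every position except 1
keep = [i for i in range(df.shape[1]) if i != 1]
by_position = df.iloc[:, keep]
print(list(by_position.columns))  # ['Courses', 'Fee']
```

When column names are unique, as in this article's examples, both approaches give the same result.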

4. Drop Multiple Columns From Pandas DataFrame

Below are some examples of deleting multiple columns from Pandas DataFrame by column name and index.

4.1 Drop Multiple Columns By Label Name

The example below deletes the columns Courses and Fee from the pandas DataFrame.


df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

This yields the output below. Use inplace=True to modify the DataFrame itself.


  Duration
0    30day
1   40days
2   35days
3   40days
4   60days
5   50days
6   55days

4.2 Drop Multiple Columns by Index

The example below deletes the columns at positions 0 and 1 (positions start from 0).


df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

This yields the same output as above.

4.3 Drop Columns from List of Columns

If you have a list of column names and want to delete all of them, use the approach below.


lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)
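
If the list might contain names that are not in the DataFrame, the errors parameter decides what happens. The following is a minimal sketch on a shortened copy of the example data; Discount is a hypothetical column name added only to trigger the missing-label case.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# "Discount" is a hypothetical name that is not a column of df
lisCol = ["Fee", "Discount"]

# Default errors='raise': a missing label raises KeyError
raised = False
try:
    df.drop(lisCol, axis=1)
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# errors='ignore': existing labels are dropped, missing ones are skipped
df2 = df.drop(lisCol, axis=1, errors="ignore")
print(list(df2.columns))  # ['Courses', 'Duration']
```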

5. Other ways to Delete Columns from Pandas DataFrame

Above are the most used ways to delete columns from a pandas DataFrame. Below are some other ways to delete one or multiple columns.

5.1 Delete columns From DataFrame inplace

If you want to remove a column in place, use inplace=True. With this option, drop() modifies the existing DataFrame and returns None. The example below drops the second column, Fee, in place.


df.drop(df.columns[1], axis = 1, inplace=True)
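
The difference between the two modes is easy to verify. This sketch uses a one-row copy of the example data and prints what each call returns.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark"], "Fee": [20000], "Duration": ["30days"]})

# Without inplace: df is untouched and the result is a new DataFrame
df2 = df.drop("Fee", axis=1)
print(list(df.columns))   # ['Courses', 'Fee', 'Duration']
print(list(df2.columns))  # ['Courses', 'Duration']

# With inplace=True: df itself changes and the call returns None
result = df.drop("Fee", axis=1, inplace=True)
print(result)             # None
print(list(df.columns))   # ['Courses', 'Duration']
```

Because the in-place call returns None, writing df = df.drop(..., inplace=True) would replace the DataFrame with None. Use either reassignment or inplace=True, not both.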

5.2 Delete Columns from a List of Columns (iteratively) By Condition.

In one of the above examples, I explained how to remove the columns given in a list. Now let’s see another approach that iterates over the columns and deletes those matching a condition. This code removes every column whose name contains Fee, which here is only the Fee column.


for col in df.columns:
    if 'Fee' in col:
        del df[col]
print(df)
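
The same condition can also be applied without deleting inside a loop. The sketch below, on a shortened copy of the example data, shows two alternatives that return a new DataFrame and leave the original unchanged. Note that str.contains() treats its pattern as a regular expression by default, which makes no difference for a plain word like Fee.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# Collect the matching labels first, then drop them in one call
to_drop = [col for col in df.columns if "Fee" in col]
df2 = df.drop(columns=to_drop)
print(list(df2.columns))  # ['Courses', 'Duration']

# Equivalent boolean-mask form using the string accessor on the column Index
df3 = df.loc[:, ~df.columns.str.contains("Fee")]
print(df3.equals(df2))    # True
```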

5.3 Using df.loc[] to Remove Columns Between Specified Columns

You can combine drop() with loc[] to remove all columns from one column name through another. Use [:, 'Courses':'Fee'] to select the first and second columns; label slices include both endpoints. With inplace=True, the drop modifies the original object.


df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
print(df)

5.4 Using df.iloc[] to Remove Columns Between Specified Column Indexes

You can combine drop() with iloc[] to remove all columns in a range of positions. Use [:, 1:2] to select only the second column, because the stop position of the slice is excluded. For instance, df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1) removes the Fee column.


df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)
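
Label slices with loc[] and position slices with iloc[] treat their end points differently, which is easy to get wrong. The sketch below compares the two on a shortened copy of the example data.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# Label slice: both endpoints are included
print(list(df.loc[:, "Courses":"Fee"].columns))  # ['Courses', 'Fee']

# Position slice: the stop position is excluded
print(list(df.iloc[:, 1:2].columns))             # ['Fee']
print(list(df.iloc[:, 0:2].columns))             # ['Courses', 'Fee']

# So these two drops remove the same columns
a = df.drop(df.loc[:, "Courses":"Fee"].columns, axis=1)
b = df.drop(df.columns[0:2], axis=1)
print(a.equals(b))                               # True
```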

6. Complete Example For Reference


import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
print(df2)

df2=df.drop(columns=["Fee"])
print(df2)

df2=df.drop(labels=["Fee"], axis = 1)
print(df2)

# Drop column by index
df2=df.drop(df.columns[1], axis = 1)
print(df2)

# Drop multiple columns by Name
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

# Drop multiple columns by Index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

# Drop Columns from List
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)

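# Drop columns between two columns (label slice includes both endpoints)
df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)

# Drop columns by position range, in place (removes 'Fee')
df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)

# Drop columns by condition ('Fee' was already removed above, so nothing is left to delete here)
for col in df.columns:
    if 'Fee' in col:
        del df[col]
print(df)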

Conclusion

In this article, you have learned how to remove or delete single and multiple columns from a pandas DataFrame by column name (label) and by index. You have also learned how to remove columns between two columns.

Happy Learning !!

NNK
