Pandas Drop Columns from DataFrame

The pandas.DataFrame.drop() method removes columns from a DataFrame. By default, it does not modify the existing DataFrame; instead, it returns a new DataFrame without the specified columns. To remove columns from the existing DataFrame object, use the inplace=True param.
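
As a minimal sketch of the difference between the default behaviour and inplace=True, the snippet below uses a small two-column DataFrame with sample values chosen only for illustration.

```python
import pandas as pd

# Small illustrative DataFrame (sample values only)
df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Default: drop() returns a new DataFrame and leaves df untouched
df2 = df.drop(columns=["Fee"])
print(list(df.columns))   # ['Courses', 'Fee']
print(list(df2.columns))  # ['Courses']

# inplace=True: df itself is modified and the call returns None
result = df.drop(columns=["Fee"], inplace=True)
print(result)             # None
print(list(df.columns))   # ['Courses']
```

Because the in-place call returns None, assigning its result back (df = df.drop(..., inplace=True)) would replace the DataFrame with None.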

In this pandas drop columns article, I will explain how to drop a single column or multiple columns, by name, by index, between two columns, etc. The drop() method removes columns or rows according to the specified label names and the corresponding axis.

Now, let’s see the drop() syntax and how to delete or drop columns (two or more) from DataFrame with examples.

1. Quick Examples of Drop Columns

Below are some quick examples of how to drop columns by name, by index, etc.


# Below are quick examples

# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
df2=df.drop(columns=["Fee"])
df2=df.drop(labels=["Fee"], axis = 1)

# Drop single column by Index
df2=df.drop(df.columns[1], axis = 1)

# Updates the DataFrame in place
df.drop(df.columns[1], axis = 1, inplace=True)

# Drop multiple columns
df.drop(["Courses", "Fee"], axis = 1, inplace=True)
df.drop(df.columns[[1,2]], axis = 1, inplace=True)

# Other ways to drop columns
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
df.drop(df.iloc[:, 1:2].columns, axis=1, inplace=True)

2. pandas.DataFrame.drop() Syntax

Below is the syntax of pandas.DataFrame.drop() method.


# pandas DataFrame drop() Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')
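  • labels – Single label or list-like. Index or column labels to drop.
  • axis – Use 1 (or 'columns') to drop columns and 0 (or 'index') to drop rows from DataFrame. Default is 0.
  • index – Single label or list-like. Alternative to specifying axis; index=labels is equivalent to labels, axis=0.
  • columns – Single label or list-like. Alternative to specifying axis; columns=labels is equivalent to labels, axis=1.
  • level – int or level name, optional. For a MultiIndex, the level from which the labels are removed.
  • inplace – Default False, which returns a new DataFrame. When True, it drops the columns in place and returns None.
  • errors – {‘ignore’, ‘raise’}, default ‘raise’. If ‘ignore’, the error is suppressed and only existing labels are dropped.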
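
The relationship between labels, axis, index, and columns can be sketched as follows. This is only an illustration on a shortened version of the sample data, not an exhaustive tour of the parameters.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Duration": ["30day", "40days", "35days"],
})

# labels + axis=1 and columns= select the same column to drop
by_labels = df.drop(labels=["Fee"], axis=1)
by_columns = df.drop(columns=["Fee"])
print(by_labels.equals(by_columns))  # True

# labels + axis=0 and index= drop rows by index label
rows_a = df.drop(labels=[0], axis=0)
rows_b = df.drop(index=[0])
print(rows_a.equals(rows_b))  # True

# index= and columns= can be combined in a single call
both = df.drop(index=[0], columns=["Fee"])
print(both.shape)  # (2, 2)
```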

Now, let’s see a detailed example. First, create a pandas DataFrame from a dictionary of lists. Our DataFrame has the columns Courses, Fee, and Duration.


# Create DataFrame
import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

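Yields below output.


# Output:
   Courses    Fee Duration
0    Spark  20000    30day
1  PySpark  25000   40days
2   Hadoop  26000   35days
3   Python  22000   40days
4   pandas  24000   60days
5   Oracle  21000   50days
6     Java  22000   55days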

3. Pandas Drop Columns

The pandas drop() method removes columns from the DataFrame by name; to drop by position, first look up the name through df.columns. As described above, it returns a new DataFrame by default, and inplace=True applies the change to the existing DataFrame object.

If a column you want to remove is not present in the DataFrame, drop() raises a KeyError. You can suppress this error with errors='ignore'.
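
The following sketch shows both behaviours of the errors param. The column name Discount is a hypothetical name used only because it does not exist in the sample data.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# Default errors='raise': a missing label raises KeyError
try:
    df.drop(columns=["Discount"])  # 'Discount' is a hypothetical, missing column
    raised = False
except KeyError as exc:
    raised = True
    print("KeyError:", exc)

# errors='ignore': missing labels are skipped, existing ones are still dropped
df2 = df.drop(columns=["Fee", "Discount"], errors="ignore")
print(list(df2.columns))  # ['Courses']
```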

You can also drop rows of the DataFrame by their index labels using the index param.

3.1 Drop Column by Name

This example removes the column named Fee from the DataFrame. Note that you need axis=1 in order to delete columns.


# Drops 'Fee' column
df2=df.drop(["Fee"], axis = 1)
print(df2)

# Explicitly using parameter name 'labels'
df2=df.drop(labels=["Fee"], axis = 1)

# Alternatively, use columns instead of labels (axis is not needed)
df2=df.drop(columns=["Fee"])

Yields below output. Use inplace=True to update the existing DataFrame instead of returning a new one.


# Output:
   Courses Duration
0    Spark    30day
1  PySpark   40days
2   Hadoop   35days
3   Python   40days
4   pandas   60days
5   Oracle   50days
6     Java   55days

3.2 Drop Column by Index

In order to remove DataFrame columns by index, first get the column labels using df.columns and then pick the label by position. Note that the index starts from 0 in Python. In the example below, df.columns[[1]] selects the second column of the DataFrame, which is Fee.


# Drop column by index.
print(df.drop(df.columns[[1]], axis = 1))

# Using inplace=True
df.drop(df.columns[[1]], axis = 1, inplace=True)
print(df)

Yields the same output as above.
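
The quick examples use df.columns[1] while this section uses df.columns[[1]]. As a small sketch on a shortened version of the sample data, the first form gives a single label and the second gives an Index holding one label; drop() accepts either.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

single = df.columns[1]      # a single label
as_index = df.columns[[1]]  # an Index holding one label

print(single)               # Fee
print(list(as_index))       # ['Fee']

# Both forms are accepted by drop() and give the same result
same = df.drop(columns=single).equals(df.drop(columns=as_index))
print(same)                 # True
```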

4. Drop Multiple Columns From DataFrame

Below are some examples of dropping multiple columns from DataFrame by column name and index.

4.1 Drop Two or More Columns By Label Name

When you have several column names to drop, create a list with the column names and pass it to the drop() method, or pass the list literal directly. The example below deletes the columns Courses and Fee from the DataFrame.


# Drop two or more columns
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

Yields below output. Use inplace=True to update the existing DataFrame instead of returning a new one.


# Output:
  Duration
0    30day
1   40days
2   35days
3   40days
4   60days
5   50days
6   55days

4.2 Drop Two or More Columns by Index

The drop() method works with labels and does not accept column positions directly, but you can overcome this by looking up the column names by position using df.columns[]. Use the below example to delete the columns at index 0 and 1 (the index starts from 0).


# Drop two or more columns by index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

Yields the same output as above.
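
One caveat with this approach: because drop() ultimately works on labels, a DataFrame with duplicate column names loses every column that shares the looked-up name. The sketch below illustrates this with made-up data and a hypothetical helper, drop_by_position, which is not part of pandas and handles only non-negative positions.

```python
import pandas as pd

def drop_by_position(df, positions):
    """Hypothetical helper: return a new DataFrame without the columns at the given positions."""
    positions = set(positions)
    keep = [i for i in range(df.shape[1]) if i not in positions]
    return df.iloc[:, keep]

# Two columns share the name 'Fee' (illustrative data)
df = pd.DataFrame([["Spark", 20000, 1000], ["PySpark", 25000, 2300]],
                  columns=["Courses", "Fee", "Fee"])

# Label-based: df.columns[[1]] is 'Fee', so both 'Fee' columns are dropped
by_label = df.drop(df.columns[[1]], axis=1)
print(list(by_label.columns))     # ['Courses']

# Position-based: only the column at position 1 is removed
by_position = drop_by_position(df, [1])
print(list(by_position.columns))  # ['Courses', 'Fee']
```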

4.3 Drop Columns from List of Columns

If you have a list of columns and you want to delete all columns from the list, use the below approach.


# Drop columns from list of columns
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)

5. Other ways to Remove Columns from DataFrame

Above are the most used ways to remove/delete columns from a DataFrame. Below are some other ways to remove one or two columns.

5.1 Remove columns From DataFrame Inplace

In case you want to remove a column in place, use inplace=True. With this param, the drop() method modifies the DataFrame and returns None. The example below drops the column at index 1 in place.


# Remove columns using drop() with inplace
df.drop(df.columns[1], axis = 1, inplace=True)

5.2 Remove Columns from a List of Columns (iteratively) By Condition.

In one of the above examples, I have explained how to remove/delete columns from a list of columns. Now let’s see another example that does the same iteratively. This code removes every column whose name contains Fee, which here is just the Fee column.


# Remove columns using for loop
# Iterate over a copy of the column names, since df changes inside the loop
for col in list(df.columns):
    if 'Fee' in col:
        del df[col]
print(df)
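
The same condition can also be applied without an explicit loop that deletes columns one at a time. The sketch below shows two possible forms on a shortened version of the sample data; both match by substring, like the loop above.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# Collect the matching column names first, then drop them in one call
to_drop = [col for col in df.columns if "Fee" in col]
df2 = df.drop(columns=to_drop)
print(list(df2.columns))  # ['Courses', 'Duration']

# Boolean-mask form using the string accessor on the column Index
mask = df.columns.str.contains("Fee", regex=False)
df3 = df.loc[:, ~mask]
print(list(df3.columns))  # ['Courses', 'Duration']
```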

5.3 Using df.loc[] to Remove Columns Between Specified Columns

Use the drop() method together with loc[] to remove all columns from one column name through another column name. A label slice includes both endpoints, so [:, 'Courses':'Fee'] selects the first and second columns. With inplace=True, the change is applied to the original object.


# Remove columns using df.loc[] & drop()
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
print(df)

5.4 Using df.iloc[] to Remove Columns Between Specified Column Indexes.

Use the drop() method together with iloc[] to remove all columns between two column positions. A position slice excludes its end, so [:, 1:2] selects only the second column. For instance, df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1) removes the Fee column.


# Remove columns using df.iloc[] & drop()
df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)
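
To make the difference between the two slicing styles concrete, here is a short sketch on a shortened version of the sample data. It only inspects which column labels each slice selects before passing them to drop().

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# Label slice: both 'Courses' and 'Fee' are included
by_label = df.loc[:, "Courses":"Fee"].columns
print(list(by_label))     # ['Courses', 'Fee']

# Position slice: 1:2 covers position 1 only (the end is excluded)
by_position = df.iloc[:, 1:2].columns
print(list(by_position))  # ['Fee']

after_label = df.drop(columns=by_label)
after_position = df.drop(columns=by_position)
print(list(after_label.columns))     # ['Duration']
print(list(after_position.columns))  # ['Courses', 'Duration']
```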

6. Complete Example For Reference


import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
print(df2)

df2=df.drop(columns=["Fee"])
print(df2)

df2=df.drop(labels=["Fee"], axis = 1)
print(df2)

# Drop column by index
df2=df.drop(df.columns[1], axis = 1)
print(df2)

# Drop multiple columns by Name
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

# Drop multiple columns by Index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

# Drop Columns from List
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)

# Drop columns between two columns
df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)

df.drop(df.iloc[:, 1:2].columns, inplace=True, axis=1)
print(df)

# Drop columns by condition
for col in list(df.columns):
    if 'Fee' in col:
        del df[col]
print(df)

1. How do I drop columns from a Pandas DataFrame in Python?

A. You can drop columns from a Pandas DataFrame using the drop() method, for example df.drop(columns=["Fee"]).

2. Can I drop multiple columns at once using the drop() method?

A. Yes, you can drop multiple columns at once by passing a list of column names to the columns parameter of the drop() method.

3. Is it possible to drop columns by index instead of column names?

A. Yes. The drop() method itself takes labels, so look up the labels by index with df.columns, for example df.drop(df.columns[[0,1]], axis=1).

4. Can I drop columns from a DataFrame by their position?

A. Yes, select the columns by position with iloc[] and pass their labels to drop(), for example df.drop(df.iloc[:, 1:2].columns, axis=1).

Conclusion

In this pandas drop columns article, you have learned how to remove or delete one column or multiple columns from a DataFrame by name and by index. You have also learned how to remove all columns between two columns, along with several other examples.

Happy Learning !!

Naveen (NNK)

Naveen (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.
