  • Post category:Pandas
  • Post last modified:March 27, 2024
Pandas Drop Columns from DataFrame

The pandas DataFrame.drop() method removes columns from a DataFrame. By default it does not modify the existing DataFrame; it returns a new DataFrame without the specified columns. To remove columns from the existing DataFrame object, pass inplace=True.

In this pandas drop columns article, I will explain how to drop a single column, multiple columns, columns by name, by index, and between two columns. The drop() method removes rows or columns according to the given labels and the corresponding axis.

Now, let’s see the drop() syntax and how to delete or drop columns (two or more) from DataFrame with examples.

1. Quick Examples of Drop Columns

If you are in a hurry, below are some quick examples of how to drop columns by name, by index, etc.


# Quick examples of drop columns

# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
df2=df.drop(columns=["Fee"])  # axis is not needed with columns=
df2=df.drop(labels=["Fee"], axis = 1)

# Drop single column by Index
df2=df.drop(df.columns[1], axis = 1)

# Updates the DataFrame in place
# (each in-place example below assumes a freshly created df)
df.drop(df.columns[1], axis = 1, inplace=True)

# Drop multiple columns
df.drop(["Courses", "Fee"], axis = 1, inplace=True)
df.drop(df.columns[[1,2]], axis = 1, inplace=True)

# Other ways to drop columns
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
df.drop(df.iloc[:, 1:2], axis=1, inplace=True)

2. Syntax of pandas.DataFrame.drop()

Below is the syntax of pandas.DataFrame.drop() method.


# pandas DataFrame drop() Syntax
DataFrame.drop(labels=None, axis=0, index=None, columns=None, level=None, inplace=False, errors='raise')

2.1 Parameters of the DataFrame.drop()

Following are the parameters of the drop() method.

  • labels – Single label or list-like. It is the index or column labels to drop.
  • axis – {0 or ‘index’, 1 or ‘columns’}, default is 0. Specifies whether to drop labels from the index (0 or ‘index’) or columns (1 or ‘columns’).
  • index – An alternative to specifying the axis and labels: index=labels is equivalent to labels, axis=0.
  • columns – An alternative to specifying the axis and labels: columns=labels is equivalent to labels, axis=1.
  • level – int or level name, optional, use for MultiIndex, level from which the labels will be removed.
  • inplace – If True, the DataFrame is modified in place, and the return value is None. If False (default), a new DataFrame with the specified columns or index labels removed is returned.
  • errors – {‘raise’, ‘ignore’}, default is ‘raise’. If ‘raise’, an error is raised for an invalid key. If ‘ignore’, any invalid key will be silently ignored.
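
The level parameter is easiest to see on a DataFrame with MultiIndex columns. The following is a minimal sketch; the two-level (year, measure) column layout and the Discount column are made up for illustration and are not part of this article's DataFrame.

```python
import pandas as pd

# Hypothetical two-level columns: (year, measure)
cols = pd.MultiIndex.from_tuples(
    [("2023", "Fee"), ("2023", "Discount"), ("2024", "Fee"), ("2024", "Discount")]
)
df_multi = pd.DataFrame(
    [[20000, 1000, 22000, 1200], [25000, 2300, 26000, 2500]],
    columns=cols,
)

# Drop every column whose second-level label is "Discount"
df_fee_only = df_multi.drop(columns="Discount", level=1)
print(df_fee_only.columns.tolist())
```

Without level, drop() on MultiIndex columns matches labels against the first level, so level=1 is what lets the call target the inner labels here.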

Return Value

The pandas.DataFrame.drop() method returns a new DataFrame with the specified labels (rows or columns) removed. If the inplace parameter is set to True, the method modifies the existing DataFrame in place and returns None.
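
The difference between the two return behaviours can be checked with a short sketch. It uses a two-row version of this article's data; the variable name result is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# Default: a new DataFrame comes back and df is left untouched
df2 = df.drop(columns=["Fee"])
print(list(df.columns))
print(list(df2.columns))

# inplace=True: df itself changes and the call returns None
result = df.drop(columns=["Fee"], inplace=True)
print(result)
print(list(df.columns))
```

A common mistake is writing df = df.drop(..., inplace=True), which replaces df with None.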

Create Pandas DataFrame

Now, let’s see a detailed example. First, create a pandas DataFrame from a dictionary of lists. Our DataFrame has the columns Courses, Fee, and Duration.


# Create DataFrame
import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

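Yields below output.


# Output:
   Courses    Fee Duration
0    Spark  20000    30day
1  PySpark  25000   40days
2   Hadoop  26000   35days
3   Python  22000   40days
4   pandas  24000   60days
5   Oracle  21000   50days
6     Java  22000   55days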

3. Pandas Drop Columns

The pandas drop() method removes columns from the DataFrame by name or by index. By default it does not modify the existing DataFrame; it returns a new DataFrame without the specified columns. To remove columns from the existing DataFrame object, pass inplace=True.

If a column you want to remove is not present in the DataFrame, drop() raises a KeyError. You can suppress this error by passing errors='ignore'.
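
As a minimal sketch of the errors param, the snippet below tries to drop a column that does not exist. Discount is a hypothetical column name chosen only because it is absent from the DataFrame.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [20000, 25000]})

# "Discount" is a hypothetical column that does not exist in df
try:
    df.drop(columns=["Discount"])
except KeyError as exc:
    print("KeyError:", exc)

# errors="ignore" skips missing labels and still drops the ones that exist
df2 = df.drop(columns=["Fee", "Discount"], errors="ignore")
print(list(df2.columns))
```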

You can also drop rows of the DataFrame by index label using the index param.
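
For completeness, here is a small sketch of dropping rows with the index param, using a three-row version of this article's data.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
})

# Drop the rows labelled 0 and 2; index=[0, 2] is equivalent to labels=[0, 2], axis=0
df2 = df.drop(index=[0, 2])
print(df2)
```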

3.1 Drop Column by Name

This example removes the column named Fee from a DataFrame. Note that you need axis=1 in order to delete columns when you pass the names as labels.


# Drops 'Fee' column
df2=df.drop(["Fee"], axis = 1)
print(df2)

# Explicitly using parameter name 'labels'
df2=df.drop(labels=["Fee"], axis = 1)

# Alternatively, use columns instead of labels; axis is then not needed.
df2=df.drop(columns=["Fee"])

Yields below output. Use inplace=True to update the DataFrame itself.


# Output:
   Courses Duration
0    Spark    30day
1  PySpark   40days
2   Hadoop   35days
3   Python   40days
4   pandas   60days
5   Oracle   50days
6     Java   55days

3.2 Drop Column by Index

To remove DataFrame columns by position, first get the column labels from df.columns and then pick the label at the position you want. Note that positions start from 0 in Python. In the example below, df.columns[[1]] selects the second column of the DataFrame, which is Fee.


# Drop column by index.
print(df.drop(df.columns[[1]], axis = 1))

# Using inplace=True (re-create df afterwards to run the later examples)
df.drop(df.columns[[1]], axis = 1, inplace=True)
print(df)

Yields the same output as above.

4. Drop Multiple Columns From DataFrame

Below are some examples of dropping multiple columns from DataFrame by column name and index.

4.1 Drop Two or More Columns By Label Name

When you have several column names to drop, pass them to the drop() method as a list, either through a list variable or directly. The example below deletes the columns Courses and Fee from the DataFrame.


# Drop two or more columns
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

Yields below output. Use inplace=True to update the DataFrame itself.


# Output:
  Duration
0    30day
1   40days
2   35days
3   40days
4   60days
5   50days
6   55days

4.2 Drop Two or More Columns by Index

If you want to drop two or more columns by position, note that the drop() method does not accept integer column positions directly. You can work around this by looking up the column names by position with df.columns[]. The example below deletes the columns at positions 0 and 1 (positions start from 0).


# Drop two or more columns by index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

Yields the same output as above.

4.3 Drop Columns from List of Columns

If you have a list of columns and you want to delete all columns from the list, use the below approach.


# Drop columns from list of columns
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)

5. Other ways to Remove Columns from DataFrame

Above are the most used ways to remove/delete columns from a DataFrame. Below are some other ways to remove one or two columns.

5.1 Remove columns From DataFrame Inplace

If you want to remove a column in place, use inplace=True. With this option, drop() modifies the DataFrame and returns None. The example below drops the second column (Fee).


# Remove columns using drop() with inplace
# (re-create df afterwards to run the later examples)
df.drop(df.columns[1], axis = 1, inplace=True)

5.2 Remove Columns from a List of Columns (iteratively) By Condition.

In one of the above examples, I explained how to remove/delete columns from a list of columns. Now let’s see another example that does the same iteratively. This code removes every column whose name contains Fee.


# Remove columns using for loop
for col in df.columns:
    if 'Fee' in col:
        del df[col]
print(df)
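
The same condition can also be written without a loop and without modifying df. This is only an alternative sketch; cols_to_drop is a name introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
})

# Collect the matching column names first, then drop them in one call
cols_to_drop = [col for col in df.columns if "Fee" in col]
df2 = df.drop(columns=cols_to_drop)
print(list(df2.columns))
```

Collecting the names first keeps the original DataFrame intact and makes it easy to inspect which columns are about to be removed.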

5.3 Using df.loc[] to Remove Columns Between Specified Columns

Use drop() together with loc[] to remove all columns from one column name through another. Label slices in loc[] include both endpoints, so [:, 'Courses':'Fee'] selects the first and second columns. With inplace=True, the original object is modified.


# Remove columns using df.loc[] & drop()
df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1, inplace=True)
print(df)
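
To make the inclusive behaviour of loc[] slices visible, the sketch below uses a four-column DataFrame. The Discount column is hypothetical and added only so that something remains on both sides of the slice.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
    "Discount": [1000, 2300],  # hypothetical extra column
})

# Label slices with loc include BOTH endpoints: this selects Fee and Duration
between = df.loc[:, "Fee":"Duration"].columns
df2 = df.drop(columns=between)
print(list(between))
print(list(df2.columns))
```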

5.4 Using df.iloc[] to Remove Columns Between Specified Column Indexes.

Use drop() together with iloc[] to remove all columns in a range of positions. Position slices in iloc[] exclude the stop position, so [:, 1:2] selects only the second column. For instance, df.drop(df.iloc[:, 1:2], inplace=True, axis=1) removes the Fee column.


# Remove columns using df.iloc[] & drop()
df.drop(df.iloc[:, 1:2], inplace=True, axis=1)
print(df)
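
For a range that covers more than one column, the following sketch shows that the stop position is excluded. The Discount column is hypothetical, and slicing df.columns directly is shown as an equivalent way to get the same labels.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
    "Discount": [1000, 2300],  # hypothetical extra column
})

# iloc slices exclude the stop position: 1:3 covers positions 1 and 2 only
by_iloc = df.iloc[:, 1:3].columns
df2 = df.drop(columns=by_iloc)
print(list(df2.columns))

# Slicing df.columns gives the same labels without building a sub-DataFrame
by_columns = df.columns[1:3]
print(list(by_columns))
```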

6. Complete Example For Reference


import pandas as pd
technologies = ({
    'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
    'Fee' :[20000,25000,26000,22000,24000,21000,22000],
    'Duration':['30day', '40days' ,'35days', '40days', '60days', '50days', '55days']
              })
df = pd.DataFrame(technologies)
print(df)

# Drop single column by Name
df2=df.drop(["Fee"], axis = 1)
print(df2)

df2=df.drop(columns=["Fee"])
print(df2)

df2=df.drop(labels=["Fee"], axis = 1)
print(df2)

# Drop column by index
df2=df.drop(df.columns[1], axis = 1)
print(df2)

# Drop multiple columns by Name
df2=df.drop(["Courses", "Fee"], axis = 1)
print(df2)

# Drop multiple columns by Index
df2=df.drop(df.columns[[0,1]], axis = 1)
print(df2)

# Drop Columns from List
lisCol = ["Courses","Fee"]
df2=df.drop(lisCol, axis = 1)
print(df2)

# Drop columns between two columns (loc label slices include both ends)
df2=df.drop(df.loc[:, 'Courses':'Fee'].columns, axis = 1)
print(df2)

# Drop columns by position on a copy, so df keeps all its columns
df3=df.copy()
df3.drop(df3.iloc[:, 1:2], inplace=True, axis=1)
print(df3)

# Drop columns by condition
for col in df.columns:
    if 'Fee' in col:
        del df[col]
print(df)

Frequently Asked Questions

How can I drop multiple columns from a DataFrame in Pandas?

To drop multiple columns from a DataFrame in Pandas, pass a list of column names to the drop method. For example, if a columns_to_drop list contains the column names ‘B’ and ‘D’, df = df.drop(columns=columns_to_drop) removes both columns and stores the result back in the DataFrame df.
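
A minimal sketch of this answer follows. The DataFrame with columns A to D is hypothetical and exists only to match the column names mentioned above.

```python
import pandas as pd

# Hypothetical DataFrame with columns A, B, C and D
df = pd.DataFrame({"A": [1, 2], "B": [3, 4], "C": [5, 6], "D": [7, 8]})

# Names of the columns to remove
columns_to_drop = ["B", "D"]

# drop() returns a new DataFrame, which is stored back in df
df = df.drop(columns=columns_to_drop)
print(list(df.columns))
```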

Can I drop multiple columns at once using the drop() method?

Yes. Pass a list of column names to the columns parameter of drop(). To drop by position, first convert the positions to names with df.columns[].

Can I drop columns from a DataFrame by their position?

Yes. drop() works with labels, so first look up the column names at the positions you want with df.columns[], then pass them to drop(). For example, with a list of integer positions named column_positions_to_drop, df.drop(columns=df.columns[column_positions_to_drop]) drops those columns.

How can I drop columns in place without creating a new DataFrame?

To drop columns in place without creating a new DataFrame, pass inplace=True to the drop method, for example df.drop(columns=['Fee'], inplace=True). This modifies the original DataFrame (df) directly, so no reassignment is needed; the call itself returns None.

Can I drop columns based on their index instead of names?

Yes. Here a column's index means its integer position. df.columns[column_indices_to_drop] retrieves the column names at those positions, and passing the result to the drop method removes the specified columns.

Is it possible to drop columns based on a condition in Pandas?

Yes. Build a boolean mask over the columns that marks the ones meeting your condition, then either keep the rest with df.loc[:, ~mask] or pass df.columns[mask] to drop().
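
As one possible illustration, the sketch below drops every column that contains only missing values. The condition and the empty Discount column are assumptions chosen for the example; any boolean Series indexed by column name would work the same way.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Discount": [None, None, None],  # hypothetical column with no data
})

# Boolean mask over the columns: True where every value is missing
all_missing = df.isna().all()

# Option 1: keep only the columns where the mask is False
df2 = df.loc[:, ~all_missing]
print(list(df2.columns))

# Option 2: pass the labels where the mask is True to drop()
df3 = df.drop(columns=df.columns[all_missing])
print(list(df3.columns))
```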

Conclusion

In this pandas drop columns article, you have learned how to remove or delete one, two, or more columns from a DataFrame by name, by label, and by index. You have also learned how to remove the columns between two columns, along with many more examples.

Happy Learning !!


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.
