Pandas – Select All Columns Except One Column

In this article, I will explain how to select all columns except one column in a pandas DataFrame. A DataFrame is a two-dimensional, labeled data structure with rows and columns: the columns hold the data, and the rows are identified by the index. While processing table-like structured data, we sometimes need to select all columns while ignoring one or more of them. Let’s see how to select all columns except one in a DataFrame, with examples.

1. Quick Examples of Select All Columns Except One Column in pandas

If you are in a hurry, below are some quick examples of how to select all columns except one column in pandas DataFrame.


# Below are quick examples
# Using .loc[] to select all columns except Duration column
df2 = df.loc[:, df.columns != "Duration"]
print(df2)

# Using drop() method to select all except Discount column
df2 = df.drop("Discount", axis=1)
print(df2)

# Drop multiple columns
df2 = df.drop(['Fee', 'Discount'], axis=1)
print(df2)

# Using Index.difference() to select all except Fee column
df2 = df[df.columns.difference(["Fee"])]
print(df2)

# Using isin() method
df2 = df.loc[:, ~df.columns.isin(['Fee'])]
print(df2)

# Using isin() method to drop multiple columns
df2 = df.loc[:, ~df.columns.isin(['Fee','Discount'])]
print(df2)

Now, let’s create a DataFrame with a few rows and columns and execute some examples and validate the results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "pandas", "Oracle", "Java"],
    'Fee': [20000, 25000, 26000, 22000, 24000, 21000, 22000],
    'Duration': ['30day', '40days', '35days', '40days', '60days', '50days', '55days'],
    'Discount': [1000, 2300, 1500, 1200, 2500, 2100, 2000]
}
df = pd.DataFrame(technologies)
print(df)

This yields the output below.


   Courses    Fee Duration  Discount
0    Spark  20000    30day      1000
1  PySpark  25000   40days      2300
2   Hadoop  26000   35days      1500
3   Python  22000   40days      1200
4   pandas  24000   60days      2500
5   Oracle  21000   50days      2100
6     Java  22000   55days      2000

2. Select All Except One Column Using .loc[] in pandas

Using the pandas.DataFrame.loc[] property, you can select all the columns you want and exclude the one you don’t. For example, df.loc[:, df.columns] selects all columns, and df.loc[:, df.columns != 'Duration'] excludes the Duration column from the selection. Note that df.columns returns a pandas Index (not a Series); comparing it with a column name produces a boolean array that .loc[] uses as a column mask.


# Using .loc[] to select all columns except Duration column
df2 = df.loc[:, df.columns != "Duration"]
print(df2)

This yields the output below.


   Courses    Fee  Discount
0    Spark  20000      1000
1  PySpark  25000      2300
2   Hadoop  26000      1500
3   Python  22000      1200
4   pandas  24000      2500
5   Oracle  21000      2100
6     Java  22000      2000
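
To see why this works, it can help to look at the mask on its own. The following is a minimal sketch that assumes a smaller, made-up DataFrame (the names df_small and mask are illustrative and not part of the article's example). It shows that the comparison returns one boolean per column and that .loc[] keeps only the columns marked True.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical values, same column names as above)
df_small = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Comparing the column Index with a label gives one boolean per column
mask = df_small.columns != "Duration"
print(mask)

# .loc[] keeps only the columns where the mask is True
df2 = df_small.loc[:, mask]
print(df2.columns.tolist())
```

Because the mask is built from df.columns, the remaining columns keep their original order.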

3. Select All Except One Column Using drop() Method in pandas

You can also select all columns except one by deleting the unwanted column with the drop() method. Note that drop() is also used to drop rows from a pandas DataFrame.

To remove columns, use axis=1 or the columns parameter. For example, df.drop("Discount", axis=1) removes the Discount column and keeps all other columns untouched. By default, drop() returns a new DataFrame without the unwanted column and leaves the original DataFrame unchanged.


# Using drop() method to select all except Discount column
df2 = df.drop("Discount", axis=1)
print(df2)

This yields the output below.


   Courses    Fee Duration
0    Spark  20000    30day
1  PySpark  25000   40days
2   Hadoop  26000   35days
3   Python  22000   40days
4   pandas  24000   60days
5   Oracle  21000   50days
6     Java  22000   55days
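
As a brief sketch of the columns parameter mentioned above, the example below uses a small, made-up DataFrame (df_small is an illustrative name, not part of the article's data). It drops a column by keyword instead of axis=1 and then checks that the original DataFrame still has all of its columns.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical values)
df_small = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# columns= is an alternative to passing the label with axis=1
df2 = df_small.drop(columns="Discount")
print(df2.columns.tolist())

# drop() returned a new DataFrame; the original is unchanged
print(df_small.columns.tolist())
```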

4. Using drop() Method to Remove Multiple Columns

If you want to drop multiple columns, pass a list of the column names you want to delete to the df.drop() method.


# Drop multiple columns
df2 = df.drop(['Fee', 'Discount'], axis=1)
print(df2)

This yields the output below.


   Courses Duration
0    Spark    30day
1  PySpark   40days
2   Hadoop   35days
3   Python   40days
4   pandas   60days
5   Oracle   50days
6     Java   55days
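
When the list of columns to exclude may contain names that are not present in the DataFrame, drop() raises a KeyError by default. The sketch below, which assumes a small made-up DataFrame and a hypothetical "Tax" column that does not exist, shows how errors="ignore" skips the missing label.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical values)
df_small = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# "Tax" is not a column here; errors="ignore" drops only the labels that exist
df2 = df_small.drop(columns=["Fee", "Discount", "Tax"], errors="ignore")
print(df2.columns.tolist())
```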

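5. Using Index.difference() Method to Select All Columns Except One

You can pass a list of the columns you don’t want to select to the difference() method of df.columns (which is a pandas Index). For example, df[df.columns.difference(["Fee"])] selects all columns except the “Fee” column in the DataFrame. Note that difference() returns the remaining column names in sorted order by default, which is why Discount appears before Duration in the output below.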


# Using Index.difference() to select all except Fee column
df2 = df[df.columns.difference(["Fee"])]
print(df2)

This yields the output below.


   Courses  Discount Duration
0    Spark      1000    30day
1  PySpark      2300   40days
2   Hadoop      1500   35days
3   Python      1200   40days
4   pandas      2500   60days
5   Oracle      2100   50days
6     Java      2000   55days
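
The output above lists the columns alphabetically rather than in their original order. If the original order matters, one option is the sort parameter of difference(). The sketch below assumes a small made-up DataFrame (df_small, sorted_cols, and ordered_cols are illustrative names) and compares the default behavior with sort=False.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical values)
df_small = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Default: the remaining labels come back sorted
sorted_cols = df_small.columns.difference(["Fee"])
print(sorted_cols.tolist())

# sort=False keeps the labels in their original column order
ordered_cols = df_small.columns.difference(["Fee"], sort=False)
df2 = df_small[ordered_cols]
print(df2.columns.tolist())
```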

6. Using df.columns.isin() Method

You can also use the isin() method with the negation operator (~). For example, df.loc[:, ~df.columns.isin(['Fee'])] returns a DataFrame with all columns except the Fee column.


# Using isin() method
df2 = df.loc[:, ~df.columns.isin(['Fee'])]
print(df2)

This yields the output below.


   Courses Duration  Discount
0    Spark    30day      1000
1  PySpark   40days      2300
2   Hadoop   35days      1500
3   Python   40days      1200
4   pandas   60days      2500
5   Oracle   50days      2100
6     Java   55days      2000
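
The same pattern extends to excluding several columns at once, as in the quick examples at the top. The following is a minimal sketch on a small made-up DataFrame (df_small and exclude are illustrative names): isin() marks the columns to exclude, and ~ inverts the mask so that .loc[] keeps the rest.

```python
import pandas as pd

# Small illustrative DataFrame (hypothetical values)
df_small = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
    "Discount": [1000, 2300],
})

# Columns we do not want in the result
exclude = ["Fee", "Discount"]

# isin() is True for excluded columns; ~ flips it to a "keep" mask
keep_mask = ~df_small.columns.isin(exclude)
df2 = df_small.loc[:, keep_mask]
print(df2.columns.tolist())
```

Unlike difference(), this approach keeps the remaining columns in their original order.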

Conclusion

In this article, you have learned how to select all columns except one column in a pandas DataFrame using DataFrame.loc[], DataFrame.drop(), Index.difference(), and Index.isin(), with examples.

Happy Learning !!
