In this article, I will explain how to select all columns except one in a pandas DataFrame. A DataFrame is a two-dimensional, labeled data structure made up of rows and columns: each column holds one variable, and each row is identified by an index label. While processing table-like structured data, we sometimes need to select all columns while ignoring one or more of them. Let’s see how to select all columns except one in a DataFrame, with examples.
1. Quick Examples of Select All Columns Except One Column in pandas
If you are in a hurry, below are some quick examples of how to select all columns except one column in pandas DataFrame.
# Below are quick examples
# Using .loc[] to select all columns except Duration column
df2 = df.loc[:, df.columns != "Duration"]
print(df2)
# Using drop() method to select all except Discount column
df2 = df.drop("Discount", axis=1)
print(df2)
# Drop multiple columns
df2 = df.drop(['Fee', 'Discount'], axis=1)
print(df2)
# Using Index.difference() to select all except Fee column
df2 = df[df.columns.difference(["Fee"])]
print(df2)
# Using isin() method
df2 = df.loc[:, ~df.columns.isin(['Fee'])]
print(df2)
# Using isin() method to drop multiple columns
df2 = df.loc[:, ~df.columns.isin(['Fee','Discount'])]
print(df2)
Now, let’s create a DataFrame with a few rows and columns, run some examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Hadoop","Python","pandas","Oracle","Java"],
'Fee' :[20000,25000,26000,22000,24000,21000,22000],
'Duration':['30day','40days','35days', '40days','60days','50days','55days'],
'Discount':[1000,2300,1500,1200,2500,2100,2000]
}
df = pd.DataFrame(technologies)
print(df)
This yields the output below.
# Output:
Courses Fee Duration Discount
0 Spark 20000 30day 1000
1 PySpark 25000 40days 2300
2 Hadoop 26000 35days 1500
3 Python 22000 40days 1200
4 pandas 24000 60days 2500
5 Oracle 21000 50days 2100
6 Java 22000 55days 2000
2. Select All Except One Column Using .loc[] in pandas
Using the pandas.DataFrame.loc[] property, you can select the columns you want and exclude the one you don’t. For example, df.loc[:, df.columns] selects all columns, and df.loc[:, df.columns != 'Duration'] leaves the Duration column out of the selection. Note that df.columns returns a pandas Index (not a Series), so the comparison produces a boolean array with one entry per column.
# Using .loc[] to select all columns except Duration column
df2 = df.loc[:, df.columns != "Duration"]
print(df2)
This yields the output below.
# Output:
Courses Fee Discount
0 Spark 20000 1000
1 PySpark 25000 2300
2 Hadoop 26000 1500
3 Python 22000 1200
4 pandas 24000 2500
5 Oracle 21000 2100
6 Java 22000 2000
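To make the boolean mask visible, here is a minimal sketch that prints the intermediate values. It uses a shortened, two-row version of the article's DataFrame (an assumption made only to keep the example small); the variable name mask is illustrative.

```python
import pandas as pd

# Two-row sample using the same column names as the article (shortened for brevity)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
    "Discount": [1000, 2300],
})

# df.columns is an Index; comparing it with a label gives one boolean per column
mask = df.columns != "Duration"
print(type(df.columns).__name__)  # Index
print(mask.tolist())              # [True, True, False, True]

# .loc keeps only the columns where the mask is True
df2 = df.loc[:, mask]
print(df2.columns.tolist())       # ['Courses', 'Fee', 'Discount']
```

Because the mask is positional, the remaining columns keep their original order.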
3. Select All Except One Column Using drop() Method in pandas
You can also select all columns except one by removing the unwanted column with the drop() method. Note that drop() is also used to drop rows from a pandas DataFrame, and that by default it returns a new DataFrame rather than modifying the original.
To remove columns, use axis=1 or the columns parameter. For example, df.drop("Discount", axis=1) removes the Discount column while keeping all other columns untouched. This gives you a DataFrame with every column except the unwanted one.
# Using drop() method to select all except Discount column
df2 = df.drop("Discount", axis=1)
print(df2)
This yields the output below.
# Output:
Courses Fee Duration
0 Spark 20000 30day
1 PySpark 25000 40days
2 Hadoop 26000 35days
3 Python 22000 40days
4 pandas 24000 60days
5 Oracle 21000 50days
6 Java 22000 55days
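The columns parameter mentioned above can be sketched as follows. This is a small illustration on a two-row version of the article's data (shortened here as an assumption for brevity); it also checks that the original DataFrame is left unchanged.

```python
import pandas as pd

# Two-row sample using the same column names as the article (shortened for brevity)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
    "Discount": [1000, 2300],
})

# columns= is an alternative to passing the label together with axis=1
df2 = df.drop(columns="Discount")
print(df2.columns.tolist())       # ['Courses', 'Fee', 'Duration']

# Both spellings give the same result
same = df2.equals(df.drop("Discount", axis=1))
print(same)                       # True

# drop() returned a new DataFrame; the original still has the column
print("Discount" in df.columns)   # True
```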
4. Using drop() Method to Remove Multiple Columns
If you want to drop multiple columns, pass a list of the column names you want to remove to the df.drop() method.
# Drop multiple columns
df2 = df.drop(['Fee', 'Discount'], axis=1)
print(df2)
This yields the output below.
# Output:
Courses Duration
0 Spark 30day
1 PySpark 40days
2 Hadoop 35days
3 Python 40days
4 pandas 60days
5 Oracle 50days
6 Java 55days
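When the list of columns to exclude may contain names that are not in the DataFrame, drop() raises a KeyError by default. One way to handle that, sketched below, is the errors="ignore" option of drop(). The column name "Tax" is hypothetical and deliberately absent from the sample data, which is a shortened version of the article's DataFrame.

```python
import pandas as pd

# Two-row sample using the same column names as the article (shortened for brevity)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
    "Discount": [1000, 2300],
})

# "Tax" is a hypothetical column that does not exist in df.
# errors="ignore" skips missing labels instead of raising KeyError.
df2 = df.drop(columns=["Fee", "Discount", "Tax"], errors="ignore")
print(df2.columns.tolist())  # ['Courses', 'Duration']
```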
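5. Using Index.difference() Method to Select All Columns Except One
You can pass the list of columns you don’t want to select to the Index.difference() method (df.columns is an Index, not a Series). For example, df[df.columns.difference(["Fee"])] selects all columns except the “Fee” column in the DataFrame. Note that difference() returns the remaining labels in sorted order by default, which is why Discount appears before Duration in the output below.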
# Using Index.difference() to select all except Fee column
df2 = df[df.columns.difference(["Fee"])]
print(df2)
This yields the output below.
# Output:
Courses Discount Duration
0 Spark 1000 30day
1 PySpark 2300 40days
2 Hadoop 1500 35days
3 Python 1200 40days
4 pandas 2500 60days
5 Oracle 2100 50days
6 Java 2000 55days
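If the original column order matters, difference() accepts a sort parameter. The sketch below compares the default behavior with sort=False on a shortened, two-row version of the article's DataFrame (an assumption made for brevity); the variable names are illustrative.

```python
import pandas as pd

# Two-row sample using the same column names as the article (shortened for brevity)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
    "Discount": [1000, 2300],
})

# Default: the remaining labels come back sorted alphabetically
sorted_cols = df.columns.difference(["Fee"])
print(sorted_cols.tolist())    # ['Courses', 'Discount', 'Duration']

# sort=False skips the sorting step, so the original order is kept
ordered_cols = df.columns.difference(["Fee"], sort=False)
print(ordered_cols.tolist())   # ['Courses', 'Duration', 'Discount']

df2 = df[ordered_cols]
```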
6. Using df.columns.isin() Method
You can also use the isin() method with the negation operator (~). For example, df.loc[:, ~df.columns.isin(['Fee'])] returns a DataFrame with all columns except the Fee column.
# Using isin() method
df2 = df.loc[:, ~df.columns.isin(['Fee'])]
print(df2)
This yields the output below.
# Output:
Courses Duration Discount
0 Spark 30day 1000
1 PySpark 40days 2300
2 Hadoop 35days 1500
3 Python 40days 1200
4 pandas 60days 2500
5 Oracle 50days 2100
6 Java 55days 2000
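The same pattern extends to several columns at once. The sketch below uses a shortened, two-row version of the article's DataFrame (an assumption for brevity); the list name exclude and the column name "Tax" are hypothetical, with "Tax" included only to show that isin() does not complain about names that are not present.

```python
import pandas as pd

# Two-row sample using the same column names as the article (shortened for brevity)
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30day", "40days"],
    "Discount": [1000, 2300],
})

# Hypothetical exclusion list; "Tax" is not a column of df
exclude = ["Fee", "Discount", "Tax"]

# isin() marks the columns found in the list; ~ inverts the mask
keep = ~df.columns.isin(exclude)
print(keep.tolist())         # [True, False, True, False]

df2 = df.loc[:, keep]
print(df2.columns.tolist())  # ['Courses', 'Duration']
```

Unlike difference() with its default settings, this approach keeps the remaining columns in their original order.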
Conclusion
In this article, you have learned how to select all columns except one in a pandas DataFrame using DataFrame.loc[], DataFrame.drop(), Index.difference(), and DataFrame.columns.isin(), with examples.
Happy Learning !!