How do you check whether a single column or multiple columns exist in a pandas DataFrame? You can use the DataFrame.columns attribute, which returns the column labels as a pandas Index, together with the Python in operator inside an if condition. In this article, I will explain several ways to check if a column exists in a pandas DataFrame, with examples.
1. Quick Examples of Check If a Column Exists in Pandas DataFrame
If you are in a hurry, below are some quick examples of how to check if a column exists in a pandas DataFrame.
# Below are quick examples
# Check if column Courses is in DataFrame.columns
if 'Courses' in df.columns:
    print("Courses column is present : Yes")
else:
    print("Courses column is present : No")
# Check if column Courses is in DataFrame
if 'Courses' in df:
    print("Courses column is present : Yes")
else:
    print("Courses column is present : No")
# Check if column Courses is not in DataFrame.columns
if 'Courses' not in df.columns:
    print("Courses column is present : No")
else:
    print("Courses column is present : Yes")
# Check that multiple columns all exist using set.issubset
if set(['Courses','Duration']).issubset(df.columns):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
# Use curly braces to build the set passed to issubset with DataFrame.columns
if {'Courses','Duration'}.issubset(df.columns):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
# Check that one or more columns all exist in DataFrame
if all([item in df.columns for item in ['Fee','Discount']]):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
Now, let's create a DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the column names Courses, Fee, Duration, and Discount.
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
Yields the below output.
    Courses    Fee Duration  Discount
r1    Spark  20000   30days      1000
r2  PySpark  25000   40days      2300
r3   Python  22000   35days      1200
r4   pandas  30000   50days      2000
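Before running the checks, it can help to look at what DataFrame.columns actually holds. The sketch below is only an illustration: demo_df is a hypothetical stand-in, not the df created above, and nothing beyond pandas itself is assumed.

```python
import pandas as pd

# Hypothetical stand-in DataFrame, used only for this illustration
demo_df = pd.DataFrame({'Courses': ["Spark", "pandas"], 'Fee': [20000, 30000]})

# DataFrame.columns is a pandas Index object rather than a plain Python list
is_index = isinstance(demo_df.columns, pd.Index)

# Convert to a list when a plain list of labels is needed
labels = demo_df.columns.tolist()

print(is_index)  # True
print(labels)    # ['Courses', 'Fee']
```

Because the Index supports the in operator, the membership checks in the following sections work on it directly without converting to a list first.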
2. Check If a Single Column Exists in DataFrame
Use DataFrame.columns with an if condition to check if a column exists. Let's see if a "Courses" column exists in the pandas DataFrame. DataFrame.columns returns an Index holding all column labels.
# Check if column Courses is in DataFrame.columns
if 'Courses' in df.columns:
    print("Courses column is present : Yes")
else:
    print("Courses column is present : No")
Yields the below output.
Courses column is present : Yes
Alternatively, you can also write it as
# Check if column Courses is in DataFrame
if 'Courses' in df:
    print("Courses column is present : Yes")
else:
    print("Courses column is present : No")
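Both forms give the same answer because the in operator on a DataFrame tests its column labels, not its cell values. The sketch below illustrates this with a small hypothetical DataFrame (demo_df is an assumed name, not part of the example above) and also shows that the match is case-sensitive.

```python
import pandas as pd

# Hypothetical two-column DataFrame used only for this illustration
demo_df = pd.DataFrame({'Courses': ["Spark", "pandas"], 'Fee': [20000, 30000]})

# `in` on a DataFrame looks at column labels, same as `in demo_df.columns`
in_frame = 'Courses' in demo_df
in_columns = 'Courses' in demo_df.columns

# Cell values are not searched: "Spark" is a value, not a column label
value_found = 'Spark' in demo_df

# Labels are matched exactly, so a different case does not match
lowercase_found = 'courses' in demo_df.columns

print(in_frame, in_columns, value_found, lowercase_found)
# True True False False
```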
3. Check If a Column Does Not Exist in DataFrame
To check whether the "XYZ" column is missing from the DataFrame, use the not in operator. For example, the condition if 'XYZ' not in df.columns: is true when no column is labeled XYZ.
# Check if column XYZ is not in DataFrame.columns
if 'XYZ' not in df.columns:
    print("XYZ column is present : No")
else:
    print("XYZ column is present : Yes")
Yields the below output.
XYZ column is present : No
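One common use of the not in check is as a guard that creates a column only when it is missing. The following is a minimal sketch of that idea, not something taken from the examples above: demo_df and the default values 0 and -1 are assumptions chosen purely for illustration.

```python
import pandas as pd

# Hypothetical DataFrame without a Discount column
demo_df = pd.DataFrame({'Courses': ["Spark", "pandas"], 'Fee': [20000, 30000]})

# Add the column with a default value only when it is missing
if 'Discount' not in demo_df.columns:
    demo_df['Discount'] = 0

# Running a guard again changes nothing, because the column now exists
if 'Discount' not in demo_df.columns:
    demo_df['Discount'] = -1

print(demo_df.columns.tolist())      # ['Courses', 'Fee', 'Discount']
print(demo_df['Discount'].tolist())  # [0, 0]
```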
4. Check If Multiple Columns Exist in Pandas DataFrame
To check if all columns in a list exist in a pandas DataFrame, use set.issubset. For example, the condition if set(['Courses','Duration']).issubset(df.columns): is true only when every listed column is present.
# Check that multiple columns all exist using set.issubset
if set(['Courses','Duration']).issubset(df.columns):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
Yields the below output.
Columns are present : Yes
The set can alternatively be constructed with curly braces instead of set([]).
# Use curly braces to build the set passed to issubset with DataFrame.columns
if {'Courses','Duration'}.issubset(df.columns):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
Yields the below output.
Columns are present : Yes
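The issubset check only says whether everything is present. When it fails, a set difference can report which labels are absent. The sketch below assumes a hypothetical demo_df and a hypothetical required set; neither comes from the example above.

```python
import pandas as pd

# Hypothetical DataFrame that lacks a Discount column
demo_df = pd.DataFrame({'Courses': ["Spark"], 'Fee': [20000], 'Duration': ['30days']})

# Hypothetical set of columns we expect to find
required = {'Courses', 'Duration', 'Discount'}

# True only if every required label is a column
all_present = required.issubset(demo_df.columns)

# Labels in `required` that are not columns, sorted for stable output
missing = sorted(required - set(demo_df.columns))

print(all_present)  # False
print(missing)      # ['Discount']
```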
5. Check If One or More Columns All Exist in DataFrame
To check if every column in a list exists in a pandas DataFrame, you can also use a list comprehension with all(). For instance, the condition if all([item in df.columns for item in ['Fee','Discount']]): is true only when each item is a column label.
# Check that one or more columns all exist in DataFrame
if all([item in df.columns for item in ['Fee','Discount']]):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
Yields the below output.
Columns are present : Yes
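The same comprehension idea extends to related questions, such as whether at least one of the columns exists, or which of them exist. The sketch below is an illustration under assumptions: demo_df and the wanted list (including the deliberately absent 'Tax' label) are hypothetical.

```python
import pandas as pd

# Hypothetical DataFrame used only for this illustration
demo_df = pd.DataFrame({'Courses': ["Spark"], 'Fee': [20000], 'Discount': [1000]})

# 'Tax' is deliberately not a column of demo_df
wanted = ['Fee', 'Discount', 'Tax']

# True only if every wanted label is a column
all_exist = all(col in demo_df.columns for col in wanted)

# True if at least one wanted label is a column
any_exist = any(col in demo_df.columns for col in wanted)

# The subset of wanted labels that are columns, in the original order
present = [col for col in wanted if col in demo_df.columns]

print(all_exist)  # False
print(any_exist)  # True
print(present)    # ['Fee', 'Discount']
```

A generator expression is passed to all() and any() here instead of a list, which lets both functions stop at the first deciding item.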
6. Complete Example of Checking If a Column Exists in DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Check if column Courses is in DataFrame.columns
if 'Courses' in df.columns:
    print("Courses column is present : Yes")
else:
    print("Courses column is present : No")
# Check if column Courses is in DataFrame
if 'Courses' in df:
    print("Courses column is present : Yes")
else:
    print("Courses column is present : No")
# Check if column Courses is not in DataFrame.columns
if 'Courses' not in df.columns:
    print("Courses column is present : No")
else:
    print("Courses column is present : Yes")
# Check that multiple columns all exist using set.issubset
if set(['Courses','Duration']).issubset(df.columns):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
# Use curly braces to build the set passed to issubset with DataFrame.columns
if {'Courses','Duration'}.issubset(df.columns):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
# Check that one or more columns all exist in DataFrame
if all([item in df.columns for item in ['Fee','Discount']]):
    print("Columns are present : Yes")
else:
    print("Columns are present : No")
Conclusion
In this article, you have learned how to check if a column exists in a DataFrame, and how to check if a column does not exist, by using the in and not in operators, set.issubset, and all() inside if conditions. You can get all DataFrame column labels by using DataFrame.columns.
Happy Learning !!