We can select columns based on single or multiple conditions using the pandas loc[] attribute. The DataFrame.loc[] attribute selects rows and columns by index labels or by boolean arrays. A pandas DataFrame is a two-dimensional tabular data structure with labeled axes, i.e. rows and columns. Selecting columns from a DataFrame returns a new DataFrame containing only the selected columns of the original.
In this article, I will explain how to select columns based on single or multiple conditions using the pandas loc[] attribute and other options.
1. Quick Examples of Pandas Select Columns by Condition
Below are some quick examples of selecting columns by condition. They assume df is the DataFrame created later in this article.
# Example 1: Pass boolean values into loc[] &
# get the specified columns
df1 = df.loc[:, [True, False, True, False]]
# Example 2: Select columns based on a condition
col = (df == 1200).any()
df2 = df.loc[:, col]
# Example 3: Select columns based on multiple conditions
# (columns containing either 25000 or 'Pandas')
col = ((df == 25000) | (df == 'Pandas')).any()
df3 = df.loc[:, col]
# Example 4: Get specified column using df[] notation
# along with a specified condition
print(df[df.columns[df.iloc[0] == '30days']])
Key Points of pandas DataFrame loc[]
- loc[] is used to select/filter rows and columns by labels.
- To select rows, provide the row index labels.
- It can also select ranges of rows and columns, every alternate row or column, etc.
pandas iloc[] is another DataFrame property; it selects rows and columns by integer position rather than by label. For a better understanding of the two, compare the differences and similarities between pandas loc[] and iloc[].
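To make the difference between label-based and position-based selection concrete, here is a minimal sketch. It rebuilds the DataFrame used in this article, and the variable names by_label, by_position, and alternate are only illustrative.

```python
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Pandas", "Python"],
    'Fee': [20000, 25000, 22000, 24000],
    'Duration': ['30days', '40days', '35days', '45days'],
    'Discount': [1000, 2300, 1200, 1500],
})

# loc[] selects by label; a label slice includes both endpoints
by_label = df.loc[:, 'Fee':'Discount']

# iloc[] selects by integer position; the stop position is excluded
by_position = df.iloc[:, 1:4]

# Every alternate column, using a step in the slice
alternate = df.loc[:, ::2]

print(list(by_label.columns))     # ['Fee', 'Duration', 'Discount']
print(list(by_position.columns))  # ['Fee', 'Duration', 'Discount']
print(list(alternate.columns))    # ['Courses', 'Duration']
```

Both slices return the same three columns here, but only the loc[] version keeps working if the column order changes.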
Now, let’s create a DataFrame with a few rows and columns and execute some examples of how to select columns based on conditions in pandas. Our DataFrame contains the column names Courses, Fee, Duration, and Discount.
# Create Pandas DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark", "Pandas", "Python"],
'Fee' :[20000, 25000, 22000, 24000],
'Duration':['30days','40days', '35days', '45days'],
'Discount':[1000, 2300, 1200, 1500]
}
df = pd.DataFrame(technologies)
print(df)
# Output:
# Courses Fee Duration Discount
# 0 Spark 20000 30days 1000
# 1 PySpark 25000 40days 2300
# 2 Pandas 22000 35days 1200
# 3 Python 24000 45days 1500

2. Select Pandas Columns based on Boolean Value
Using df.loc[], we can get rows or columns of a DataFrame by label or by boolean mask. Here, I pass a list of boolean values in the column position of loc[]; the list must have the same length as the number of columns in the DataFrame. loc[] returns the columns whose corresponding boolean value is True.
# Pass boolean value into loc[] & get specified column
df1 = df.loc[: , [True, False, True, False]]
print(df1)
# Output:
# Courses Duration
# 0 Spark 30days
# 1 PySpark 40days
# 2 Pandas 35days
# 3 Python 45days
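Writing the boolean list by hand gets tedious for wide DataFrames. One option, sketched below, is to build the mask from the column names. The rule used here (names starting with 'D') and the names mask and selected are only illustrative assumptions, not part of the original example.

```python
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Pandas", "Python"],
    'Fee': [20000, 25000, 22000, 24000],
    'Duration': ['30days', '40days', '35days', '45days'],
    'Discount': [1000, 2300, 1200, 1500],
})

# Build a boolean mask with one entry per column:
# True where the column name starts with 'D'
mask = df.columns.str.startswith('D')
print(list(mask))  # one boolean per column, in column order

# Pass the mask in the column position of loc[]
selected = df.loc[:, mask]
print(list(selected.columns))  # ['Duration', 'Discount']
```

Because the mask is derived from df.columns, it always has the required length.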
3. Select Pandas Columns based on a Single Condition
We can get specific columns of a pandas DataFrame based on a condition by combining the any() function with the loc[] attribute. First, evaluate the condition df == 1200; it returns a DataFrame of the same shape whose elements are boolean values: True where the original value equals 1200, otherwise False.
Then, call any() on the boolean DataFrame. It returns a boolean Series indexed by the column names, with True for every column that contains at least one True.
Finally, pass this boolean Series into the column position of df.loc[]; it returns the matching columns of the DataFrame.
# Select column based on condition
# (assign to a new name so the original df stays intact)
col = (df == 1200).any()
df2 = df.loc[:, col]
print(df2)
# Output:
# Discount
# 0 1000
# 1 2300
# 2 1200
# 3 1500
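The three steps described in this section can be inspected one at a time. The sketch below rebuilds the article's DataFrame; the intermediate names matches and result are only illustrative.

```python
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Pandas", "Python"],
    'Fee': [20000, 25000, 22000, 24000],
    'Duration': ['30days', '40days', '35days', '45days'],
    'Discount': [1000, 2300, 1200, 1500],
})

# Step 1: elementwise comparison gives a boolean DataFrame of the same shape
matches = (df == 1200)
print(matches.shape)  # (4, 4)

# Step 2: any() reduces each column to a single boolean,
# True if at least one cell in that column matched
col = matches.any()
print(col.tolist())  # [False, False, False, True]

# Step 3: the boolean Series selects the matching columns
result = df.loc[:, col]
print(list(result.columns))  # ['Discount']
```

Only Discount is selected, because 1200 appears nowhere else in the data.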
4. Select Pandas Columns based on Multiple Conditions
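Similarly, using the above syntax we can select columns based on multiple conditions. Combine the boolean DataFrames with | (or) before calling any() to select the columns that contain any of the given values. Note that combining two equality checks with & (and) never matches here, because no single cell can equal both 25000 and 'Pandas'. For example,
# Select columns based on multiple conditions
# (df is the four-column DataFrame created above)
col = ((df == 25000) | (df == 'Pandas')).any()
df3 = df.loc[:, col]
print(df3)
# Output:
# Courses Fee
# 0 Spark 20000
# 1 PySpark 25000
# 2 Pandas 22000
# 3 Python 24000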
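When combining conditions, it helps to distinguish between combining them per cell (before any()) and per column (after any()). The sketch below rebuilds the article's DataFrame and compares the two; the names cell_and, either, and both, and the extra value 1000, are illustrative assumptions.

```python
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Pandas", "Python"],
    'Fee': [20000, 25000, 22000, 24000],
    'Duration': ['30days', '40days', '35days', '45days'],
    'Discount': [1000, 2300, 1200, 1500],
})

# Cell-level AND: no single cell equals both values, so no column matches
cell_and = ((df == 25000) & (df == 'Pandas')).any()
print(list(df.loc[:, cell_and].columns))  # []

# Cell-level OR: columns that contain either value
either = ((df == 25000) | (df == 'Pandas')).any()
print(list(df.loc[:, either].columns))  # ['Courses', 'Fee']

# Column-level AND: columns that contain 1200 and also contain 1000
both = (df == 1200).any() & (df == 1000).any()
print(list(df.loc[:, both].columns))  # ['Discount']
```

Reducing each condition with any() first and then combining the resulting Series is usually what "columns that satisfy both conditions" means.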
5. Using df[] & Get Specified Column Based on Condition
When we want to select columns based on a condition on the values of a row, we can also use df[] notation. The expression df.iloc[0] == '30days' compares every value in the first row with '30days'; indexing df.columns with the result gives the matching column names, and passing those names to df[] selects the columns: df[df.columns[df.iloc[0] == '30days']].
# Get specified column using df[] notation
# Along with specified condition
print(df[df.columns[df.iloc[0] == '30days']])
# Output:
# Duration
# 0 30days
# 1 40days
# 2 35days
# 3 45days
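The same row-based mask can also be passed straight to loc[]. The sketch below rebuilds the article's DataFrame and checks that both spellings select the same column; the names row_mask, via_columns, and via_loc are only illustrative.

```python
import pandas as pd

# Same sample data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Pandas", "Python"],
    'Fee': [20000, 25000, 22000, 24000],
    'Duration': ['30days', '40days', '35days', '45days'],
    'Discount': [1000, 2300, 1200, 1500],
})

# Boolean Series indexed by column name: True where the first row is '30days'
row_mask = df.iloc[0] == '30days'

# df[] notation: turn the mask into column names, then select them
via_columns = df[df.columns[row_mask]]

# loc[] notation: pass the mask directly in the column position
via_loc = df.loc[:, row_mask]

print(list(via_loc.columns))        # ['Duration']
print(via_columns.equals(via_loc))  # True
```

Which form to use is largely a matter of taste; loc[] avoids the intermediate lookup in df.columns.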
6. Conclusion
In this article, I have explained how to select columns based on single or multiple conditions using the pandas loc[] and iloc[] attributes and df[] notation, with multiple examples.