How to Filter Pandas DataFrame by multiple conditions? By using df[],
loc[]
, query()
, eval()
and numpy.where()
we can filter Pandas DataFrame by multiple conditions. The process of applying multiple filter conditions in Pandas DataFrame is one of the most frequently performed tasks while manipulating data. Pandas provide several techniques to retrieve subsets of data from the DataFrame efficiently.
In this article, I will explain how to filter pandas DataFrame multiple conditions using DataFrame.loc[]
, DataFrame.query()
, df[]
, DataFrame.eval()
, and numpy.where()
functions.
1. Quick Examples of Filter DataFrame Multiple Conditions
If you are in a hurry, below are some quick examples of how to filter DataFrame by multiple conditions.
# Below are the quick examples
# Example 1: Use DataFrame.loc[] to filter by multiple conditions
df2 = df.loc[(df['Fee']>=24000) &
(df['Discount']< 2000) &
(df['Courses'].str.startswith('P')),
['Courses','Duration']]
# Example 2: Use DataFrame.query() to filter by multiple conditions
df2 = df.query('Fee >= 22000 &
Discount < 3000 &
Courses.str.startswith("H").values')
# Example 3: Use Pandas Boolean Indexing df[] to Filter by Multiple Conditions
df2 = df[(df['Fee']>=22000) &
(df['Discount']< 3000) &
df['Courses'].str.startswith('P')][['Courses','Fee', 'Discount']]
# Example 4: Use DataFrame.eval() to filter by multiple conditions
df2 =df[df.eval("Fee <= 25000 &
(Discount < 2300) &
Courses.str.startswith('S').values")]
# Example 5: Use numpy.where() to filter by multiple conditions
df2 = np.where((df['Fee']>=22000) &
(df['Discount']< 3000) &
(df['Courses'].str.startswith('P')))
df3 = df.loc[df2])
Now, Let’s create Pandas DataFrame using data from a Python dictionary, where the columns are Courses
, Fee
, Duration
and Discount
.
import pandas as pd
import numpy as np
technologies= ({
'Courses':["Spark","Pyspark","Hadoop","Pandas"],
'Fee' :[22000,25000,24000,26000],
'Duration':['30days','50days','40days','60days'],
'Discount':[1000,2300,2500,1400]
})
df = pd.DataFrame(technologies)
print(df)
Yields below output.

2. Pandas Filter by Multiple Conditions
To filter Pandas DataFrame by multiple conditions use DataFrame.loc[] property along with the conditions. Make sure you surround each condition with a bracket. Here, we will get all rows having Fee
greater or equal to 24000
and Discount
is less than 2000
and their Courses
start with ‘P’
from the DataFrame. Print the details with Courses
and their Duration
.
# Use DataFrame.loc[] to filter by multiple conditions
df2 = df.loc[(df['Fee']>=24000) & (df['Discount']< 2000) & (df['Courses'].str.startswith('P')),
['Courses','Duration']]
print(df2)
Yields below output.
# Output:
Courses Duration
3 Pandas 60days
3. Use DataFrame.query() to Filter by Multiple Conditions
DataFrame.query() function can also be used to filter by multiple conditions in pandas. In the below example, we have specified some conditions, such as the Fee>=22000
and Discount<3000
, and the first letter in the column Courses must start with H. After applying the expression, it returns a new DataFrame.
# Use DataFrame.query() to filter by multiple conditions
df2 = df.query('Fee >= 22000 & Discount < 3000 & Courses.str.startswith("H").values')
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
2 Hadoop 24000 40days 2500
4. Filter by Multiple Conditions using df[] Notation
Similarly, we get all rows having Fee>=22000
and Discount <3000
, and their Courses start with ‘P’
from the DataFrame. In order to select the subset of data using the values in the DataFrame and applying Boolean conditions.
# Use Pandas Boolean Indexing df[] to Filter by Multiple Conditions
df2 = df[(df['Fee']>=22000) & (df['Discount']< 3000) & df['Courses'].str.startswith('P')][['Courses','Fee', 'Discount']]
print(df2)
Yields below output.
# Output:
Courses Fee Discount
1 Pyspark 25000 2300
3 Pandas 26000 1400
5. Use DataFrame.eval() to Filter by Multiple Conditions
The eval()
function is used to evaluate a string describing operations on DataFrame columns which can be used to filter Pandas DataFrame by multiple conditions. It operates on columns only, not specific rows or elements. Inside the parentheses of the eval()
function, we have specified two conditions with AND operators between them.
This function has extracted one row that satisfies the condition. Here, we get all rows having Fee<=25000
and Discount<2300
, and their Courses
start with ‘S’
from the DataFrame. Similar to the query()
function.
# Use DataFrame.eval() to filter by multiple conditions
df2 =df[df.eval("Fee <= 25000 & (Discount < 2300) & Courses.str.startswith('S').values")]
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
0 Spark 22000 30days 1000
6. Use NumPy.where() to Filter by Multiple Conditions
Alternatively, we can also use numpy.where() function to filter pandas DataFrame by specified multiple conditions. we will get all rows having Fee
greater or equal to 22000
and Discount
is less than 3000,
and the first character of the column Courses must start with the letter P
.
The np.where()
function has returned an array with two index values, 1
and 3
. We can use the loc[]
function to retrieve the rows.
# Use numpy.where() to filter by multiple conditions
df2 = np.where((df['Fee']>=22000) & (df['Discount']< 3000) & (df['Courses'].str.startswith('P')))
print(df2)
df3 = df.loc[df2])
print(df3)
Yields below output.
# Output:
(array([1, 3], dtype=int64),)
Courses Fee Duration Discount
1 Pyspark 25000 50days 2300
3 Pandas 26000 60days 1400
7. Complete Example Filter DataFrame by Multiple Conditions
import pandas as pd
import numpy as np
technologies= ({
'Courses':["Spark","Pyspark","Hadoop","Pandas"],
'Fee' :[22000,25000,24000,26000],
'Duration':['30days','50days','40days','60days'],
'Discount':[1000,2300,2500,1400]
})
df = pd.DataFrame(technologies)
print(df)
# Use DataFrame.loc[] to filter by multiple conditions
df2 = df.loc[(df['Fee']>=24000) & (df['Discount']< 2000) & (df['Courses'].str.startswith('P')),
['Courses','Duration']]
print(df2)
# Use DataFrame.query() to filter by multiple conditions
df2 = df.query('Fee >= 22000 & Discount < 3000 & Courses.str.startswith("H").values')
print(df2)
# Use Pandas Boolean Indexing df[] to Filter by Multiple Conditions
df2 = df[(df['Fee']>=22000) & (df['Discount']< 3000) & df['Courses'].str.startswith('P')][['Courses','Fee', 'Discount']]
print(df2)
# Use DataFrame.eval() to filter by multiple conditions
df2 =df[df.eval("Fee <= 25000 & (Discount < 2300) & Courses.str.startswith('S').values")]
print(df2)
# Use numpy.where() to filter by multiple conditions
df2 = np.where((df['Fee']>=22000) & (df['Discount']< 3000) & (df['Courses'].str.startswith('P')))
print(df2)
df3 = df.loc[df2])
print(df3)
8. Conclusion
In this article, I have explained how to filter pandas DataFrame with multiple conditions by using DataFrame.loc[]
, DataFrame.query()
, df[]
, DataFrame.eval()
, and numpy.where()
function with several examples.
Happy Learning !!
Related Articles
- Pandas Series filter() Function
- How to Use NOT IN Filter in Pandas
- Pandas Filter DataFrame Rows on Dates
- Pandas Filter Rows Using IN Like SQL
- pandas DataFrame filter() – Usage & Examples
- Pandas Filter DataFrame by Substring criteria
- Apply Multiple Filters to Pandas DataFrame or Series
- Pandas Operator Chaining to Filter DataFrame Rows
- Pandas Filter Rows with NAN Value from DataFrame Column
- Pandas filter by column value