You can filter rows from a pandas DataFrame based on a single condition or multiple conditions using the DataFrame.loc[] attribute, the DataFrame.query() method, or the DataFrame.apply() method. In this article, I will explain how to filter rows by condition(s) with several examples.
Related:
- pandas.DataFrame.filter() – To filter rows by index and columns by name.
- pandas.DataFrame.loc[] – To select rows by index label and columns by name.
- pandas.DataFrame.iloc[] – To select rows and columns by position.
- pandas.DataFrame.apply() – To select rows using a custom lambda function.
1. Quick Examples of Filter Rows by Condition(s)
If you are in a hurry, below are some quick examples of how to filter DataFrame rows by a single condition and by multiple conditions.
# Filter Rows Based on condition
df[df["Courses"] == 'Spark']
df.loc[df['Courses'] == value]
df.query("Courses == 'Spark'")
df.loc[df['Courses'] != 'Spark']
df.loc[df['Courses'].isin(values)]
df.loc[~df['Courses'].isin(values)]
# Filter Multiple Conditions using Multiple Columns
df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
df.loc[(df['Discount'] >= 1200) & (df['Fee'] >= 23000 )]
# Using a lambda function
df.apply(lambda col: col[df['Courses'].isin(['Spark','PySpark'])])
# Drop rows that have None or NaN values
df.dropna()
# Other examples
df[df['Courses'].str.contains("Spark")]
df[df['Courses'].str.lower().str.contains("spark")]
df[df['Courses'].str.startswith("P")]
Let's run through these examples with sample data and explore the output to understand them better. First, let's create a pandas DataFrame from a dictionary.
import pandas as pd
import numpy as np
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
    'Fee':[22000,25000,23000,24000,26000],
    'Duration':['30days','50days','30days',None,np.nan],
    'Discount':[1000,2300,1000,1200,2500]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
Yields below output.
# Output:
Create DataFrame:
   Courses    Fee Duration  Discount
0    Spark  22000   30days      1000
1  PySpark  25000   50days      2300
2   Hadoop  23000   30days      1000
3   Python  24000     None      1200
4   Pandas  26000      NaN      2500
Note that the above DataFrame also contains None and NaN values in the Duration column, which I will use in the examples below to filter rows with missing values.
2. Filter Rows by Condition
You can use df[df["Courses"] == 'Spark']
to filter rows by a condition in pandas DataFrame. Not that this expression returns a new DataFrame with selected rows.
# Filter Rows by Condition
df2=df[df["Courses"] == 'Spark']
print("After filtering the rows:\n", df2)
Yields below output.
# Output:
After filtering the rows:
  Courses    Fee Duration  Discount
0   Spark  22000   30days      1000
You can also write the above statement with a variable.
value="Spark"
df2=df[df["Courses"] == value]
If you want to filter rows where the value is not equal, use the != operator.
df[df["Courses"] != 'Spark']
3. Using DataFrame.query()
The query() method lets you filter pandas DataFrame rows using an expression string; below is a simple example. You can use query() to run pretty much any example explained in this article.
df2=df.query("Courses == 'Spark'")
print(df2)
Yields the same output as above. You can also try other examples explained above with this approach.
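query() can also reference Python variables inside the expression using the @ prefix, which is handy when the filter value is computed elsewhere. A minimal sketch:
# Reference a Python variable inside query() using @
course = "Spark"
df2 = df.query("Courses == @course")
print(df2)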
4. Using DataFrame.loc[] to Filter by Condition
You can achieve the same result by using the DataFrame.loc[] attribute.
# Using DataFrame.loc to Filter by Condition
df2=df.loc[df['Courses'] == "Spark"]
print(df2)
Yields the same output as above. You can also try other examples explained above with this approach.
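One advantage of DataFrame.loc[] is that you can select a subset of columns in the same call. A minimal sketch that filters rows and keeps only two columns:
# Filter rows and select columns in a single .loc[] call
df2 = df.loc[df['Courses'] == "Spark", ['Courses', 'Fee']]
print(df2)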
5. Filter Rows by List of Values
If you have values in a list and want to filter rows based on that list of values, use the isin() method.
# Filter Rows Based on List of Values
values=["Spark","PySpark"]
print(df[df["Courses"].isin(values)] )
# Using df.loc
print(df.loc[df['Courses'].isin(values)])
Yields below output.
# Output:
   Courses    Fee Duration  Discount
0    Spark  22000   30days      1000
1  PySpark  25000   50days      2300
Filtering rows by negating a condition can be done using the ~ operator.
df2=df.loc[~df['Courses'].isin(values)]
print(df2)
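Keep in mind that isin() matches values exactly, including case. If you need a case-insensitive membership test, one option is to normalize both sides first; a minimal sketch:
# Case-insensitive isin() by lower-casing both sides
values_lower = [v.lower() for v in values]
print(df[df['Courses'].str.lower().isin(values_lower)])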
6. pandas Filter Rows by Multiple Conditions
Most of the time you will need to filter rows based on multiple conditions applied to multiple columns; you can do that in pandas as shown below. Note that parentheses are required around each condition because of Python's operator precedence rules: the & operator binds more tightly than comparison operators such as <= and >=, so leaving out the parentheses leads to errors or unexpected results.
# Filter Rows based on multiple conditions
print(df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)])
print(df[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)])
print(df.query("Discount >= 1000 & Discount <= 2000"))
Yields below output.
# Output:
  Courses    Fee Duration  Discount
0   Spark  22000   30days      1000
2  Hadoop  23000   30days      1000
3  Python  24000     None      1200
You can also use multiple columns to filter Pandas DataFrame rows.
df2=df.loc[(df['Discount'] >= 1200) & (df['Fee'] >= 23000 )]
print(df2)
Yields below output.
# Output:
   Courses    Fee Duration  Discount
1  PySpark  25000   50days      2300
3   Python  24000     None      1200
4   Pandas  26000      NaN      2500
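You can combine conditions with the | (or) operator in the same way, again wrapping each condition in parentheses. A minimal sketch:
# OR condition: keep rows where either condition holds
df2 = df.loc[(df['Fee'] >= 25000) | (df['Discount'] >= 2300)]
print(df2)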
7. Using DataFrame.apply() & Lambda Function
The DataFrame.apply() method can also be used with a lambda function to filter rows. Note that with the default axis=0, apply() passes each column (not each row) to the lambda; indexing the column with a boolean mask keeps only the matching elements, so the result is a DataFrame containing just the matching rows.
# By using a lambda function with apply()
print(df.apply(lambda col: col[df['Courses'].isin(['Spark','PySpark'])]))
Yields below output.
# Output:
   Courses    Fee Duration  Discount
0    Spark  22000   30days      1000
1  PySpark  25000   50days      2300
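If you want a genuinely row-by-row check, pass axis=1 so the lambda receives one row at a time and returns a boolean; the resulting mask can then be used to index the DataFrame. A minimal sketch:
# Row-wise filtering: each row is passed as a Series, the lambda returns a boolean
mask = df.apply(lambda row: row['Courses'] in ['Spark', 'PySpark'], axis=1)
print(df[mask])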
8. Using With None & nan Data
If you want to drop rows that have None or NaN values, use the DataFrame.dropna() method.
# Drop rows that have None or NaN values
print(df.dropna())
Yields below output.
# Output:
   Courses    Fee Duration  Discount
0    Spark  22000   30days      1000
1  PySpark  25000   50days      2300
2   Hadoop  23000   30days      1000
If you instead want to drop columns that have None or NaN values, pass axis='columns'; refer to how to drop Pandas DataFrame columns for more details.
print(df.dropna(axis='columns'))
Yields below output.
# Output:
   Courses    Fee  Discount
0    Spark  22000      1000
1  PySpark  25000      2300
2   Hadoop  23000      1000
3   Python  24000      1200
4   Pandas  26000      2500
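dropna() also accepts a subset parameter when only certain columns should be considered while deciding which rows to drop. A minimal sketch:
# Only consider the 'Duration' column when dropping rows
print(df.dropna(subset=['Duration']))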
9. Using Series String Methods
You can also filter rows using the string methods available on a Series through the .str accessor, such as contains(), lower(), and startswith().
# Filter based on value contains
print(df[df['Courses'].str.contains("Spark")])
# Filter after converting values
print(df[df['Courses'].str.lower().str.contains("spark")])
# Filter startswith
print(df[df['Courses'].str.startswith("P")])
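Note that str.contains() also accepts a case parameter, so the lower() conversion shown above can be avoided. A minimal sketch:
# Case-insensitive substring match without converting the column
print(df[df['Courses'].str.contains("spark", case=False)])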
FAQ on Pandas Filter Rows by Conditions
How do I filter rows based on a single condition?
To filter rows based on a single condition, use boolean indexing. For example, given a DataFrame df, to filter rows where column 'x' equals 10, use df[df['x'] == 10].
How do I filter rows based on multiple conditions?
To filter rows based on multiple conditions, combine them with the & (and) and | (or) operators. For example, df[(df['x'] > 10) & (df['y'] < 5)] filters rows where 'x' is greater than 10 and 'y' is less than 5.
How do I filter rows with an "OR" condition?
Use the | operator for "OR" conditions. For example, df[(df['x'] > 10) | (df['y'] < 5)] filters rows where 'x' is greater than 10 or 'y' is less than 5.
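When mixing & and |, group the sub-conditions with parentheses so the precedence is explicit. A minimal, self-contained sketch using the hypothetical columns 'x' and 'y' from the answers above:
# Group sub-conditions with parentheses when mixing & and |
import pandas as pd
data = pd.DataFrame({'x': [5, 12, 20], 'y': [3, 7, 2]})
print(data[((data['x'] > 10) & (data['y'] < 5)) | (data['x'] == 5)])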
Conclusion
In this article, I have explained different examples of how to filter pandas DataFrame rows based on single and multiple conditions. Remember that filtering rows always returns a new DataFrame with the selected rows.
Happy Learning !!
Related Articles
- Pandas apply map (applymap()) Explained
- Pandas apply() return multiple columns
- How to apply apply() function to every row
- Pandas apply() with Lambda Examples
- How to apply Pandas apply() Function to Single & Multiple Column(s)
- Pandas Series apply() function usage
- How to apply multiple filters to Pandas DataFrame or Series
- Difference between map, applymap and apply Methods
- How to add/insert row to Pandas DataFrame?
- Pandas compare two DataFrames row by row
- Difference between two DataFrames