Pandas DataFrame.query()
method is used to query the rows based on the provided expression (single or multiple column conditions) and returns a new DataFrame. In case you want to update the existing referring DataFrame use inplace=True
argument.
In this article, I will explain the syntax of the Pandas DataFrame query() method and several working examples like a query with multiple conditions and a query with a string containing to new few.
Related:
- pandas.DataFrame.filter() – To filter rows by index and columns by name.
- pandas.DataFrame.loc[] – To select rows by indices label and column by name.
- pandas.DataFrame.iloc[] – To select rows by index and column by position.
- pandas.DataFrame.apply() – To custom select using lambda function.
1. Quick Examples of pandas query()
If you are in a hurry, below are quick examples of how to use pandas.DataFrame.query() method.
# Below are the quick examples.
# Query Rows using DataFrame.query()
df2=df.query("Courses == 'Spark'")
# Using variable
value='Spark'
df2=df.query("Courses == @value")
# Inpace
df.query("Courses == 'Spark'",inplace=True)
# Not equals, in & multiple conditions
df.query("Courses != 'Spark'")
df.query("Courses in ('Spark','PySpark')")
df.query("`Courses Fee` >= 23000")
df.query("`Courses Fee` >= 23000 and `Courses Fee` <= 24000")
If you are a learner, Let’s see with sample data then run through these examples, and explore the output to understand better. First, let’s create a pandas DataFrame from Dict.
# Create DataFrame
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
'Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days', None,np.nan],
'Discount':[1000,2300,1000,1200,2500]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
Yields below output.

Note that the above DataFrame also contains None
and NaN
values on the Duration column that I would be using in my examples below to select rows that have None
& NaN
values or select ignoring these values.
Related: You can drop the Pandas rows with NaN values.
3. Using DataFrame.query()
Following is the syntax of the DataFrame.query()
method.
# Query() method syntax
DataFrame.query(expr, inplace=False, **kwargs)
expr
– expression takes conditions to query rowsinplace
– Defaults toFalse
. When it is set toTrue
, it updates the existing DataFrame, andquery()
method returnsNone
.**kwargs
– Keyword arguments that work with eval()
DataFrame.query() is used to filter rows based on one or multiple conditions in an expression.
# Query all rows with Courses equals 'Spark'
df2 = df.query("Courses == 'Spark'")
print("After filtering the rows based on condition:\n", df2)
Yields below output.

In case you want to use a variable in the expression, use @ character
.
# Query Rows by using Python variable
value='Spark'
df2 = df.query("Courses == @value")
print("After filtering the rows based on condition:\n", df2)
Yields the same output as above.
If you notice the above examples return a new DataFrame after filtering the rows. if you want to update the existing DataFrame use inplace=True
# Replace current esisting DataFrame
df.query("Courses == 'Spark'",inplace=True)
print("After filtering the rows based on condition:\n", df)
Yields the same output as above.
If you want to select based on column value not equals then use != operator
.
# Not equals condition
df2 = df.query("Courses != 'Spark'")
print("After filtering the rows based on condition:\n", df2)
Yields below output.
# Output:
Courses Courses Fee Duration Discount
1 PySpark 25000 50days 2300
2 Hadoop 23000 30days 1000
3 Python 24000 None 1200
4 Pandas 26000 NaN 2500
4. Select Rows Based on List of Column Values
If you have values in a Python list and want to select the rows based on the list of values, use in
operator, it’s like checking a value contained in a list of string values.
# Query Rows by list of values
df2 = df.query("Courses in ('Spark','PySpark')")
print("After filtering the rows based on condition:\n", df2)
Yields below output.
# Output:
# After filtering the rows based on condition:
Courses Fee Duration Discount
0 Spark 22000 30days 1000
1 PySpark 25000 50days 2300
You can also write with a list of values in a Python variable.
# Query Rows by list of values
values=['Spark','PySpark']
df2 = df.query("Courses in @values")
print("After filtering the rows based on condition:\n", df2)
Yields the same output as above.
Selecting rows that are not in a list of column values can be done using not-in
operator.
# Query Rows not in list of values
values=['Spark','PySpark']
df2 = df.query("Courses not in @values")
print("After filtering the rows based on condition:\n", df)
Yields below output.
# Output:
# After filtering the rows based on condition:
Courses Fee Duration Discount
2 Hadoop 23000 30days 1000
3 Python 24000 None 1200
4 Pandas 26000 NaN 2500
If you have column names with special characters you can use column names surrounded by a tick `
character.
# Create DataFrame
technologies= {
'Courses':["Spark","PySpark","Hadoop","Python","Pandas"],
'Courses Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days', None,np.nan],
'Discount':[1000,2300,1000,1200,2500]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
# Using columns with special characters
df2 = df.query("`Courses Fee` >= 23000")
print("After filtering the rows based on condition:\n", df2)
Yields below output.
# Output:
# Create DataFrame:
Courses Courses Fee Duration Discount
0 Spark 22000 30days 1000
1 PySpark 25000 50days 2300
2 Hadoop 23000 30days 1000
3 Python 24000 None 1200
4 Pandas 26000 NaN 2500
# After filtering the rows based on condition:
Courses Courses Fee Duration Discount
1 PySpark 25000 50days 2300
2 Hadoop 23000 30days 1000
3 Python 24000 None 1200
4 Pandas 26000 NaN 2500
5. Query with Multiple Conditions
In Pandas or any table-like structures, most of the time we would need to select the rows based on multiple conditions by using multiple columns, you can do that in Pandas DataFrame as below.
# Query by multiple conditions
df2 = df.query("`Courses Fee` >= 23000 and `Courses Fee` <= 24000")
print("After filtering the rows based on multiple conditions:\n", df2)
Yields below output. Alternatively, you can also use pandas loc with multiple conditions.
# Output:
# After filtering the rows based on multiple conditions:
Courses Courses Fee Duration Discount
2 Hadoop 23000 30days 1000
3 Python 24000 None 1200
6. Query Rows using apply()
pandas.DataFrame.apply() method is used to apply the expression row-by-row and return the rows that match the values. The below example returns every match when Courses contains a list of specified string values.
# By using lambda function
df2 = df.apply(lambda row: row[df['Courses'].isin(['Spark','PySpark'])])
print("After filtering the rows based on condition:\n", df2)
Yields below output. A lambda expression is used with pandas to apply the function for each row.
# Output:
# After filtering the rows based on condition:
Courses Fee Duration Discount
0 Spark 22000 30days 1000
1 PySpark 25000 50days 2300
8. Other Examples using df[] and loc[]
# Other examples you can try to query rows
df[df["Courses"] == 'Spark']
df.loc[df['Courses'] == value]
df.loc[df['Courses'] != 'Spark']
df.loc[df['Courses'].isin(values)]
df.loc[~df['Courses'].isin(values)]
df.loc[(df['Discount'] >= 1000) & (df['Discount'] <= 2000)]
df.loc[(df['Discount'] >= 1200) & (df['Fee'] >= 23000 )]
# Select based on value contains
print(df[df['Courses'].str.contains("Spark")])
# Select after converting values
print(df[df['Courses'].str.lower().str.contains("spark")])
# Select startswith
print(df[df['Courses'].str.startswith("P")])
Frequently Asked Questions on Pandas DataFrame Query Examples
You can use the query()
method to filter the based on a single condition. For example, query_result = df.query("column == 'value'")
You can use the query()
method to filter the based multiple conditions. For example, query_result = df.query("`column name1` >= 23000 and `Column name2` <= 24000")
In Pandas, the query()
method is used for DataFrame querying based on a boolean expression. This method allows you to filter rows from a DataFrame that meet a specified condition.
Conclusion
In this article, I have explained multiple examples of how to query Pandas DataFrame Rows based on single and multiple conditions, from a list of values (checking column value exists in the list of string values), etc. Remember when you query DataFrame Rows, it always returns a new DataFrame with selected rows, in order to update existing df you have to use inplace=True
. I hope this article helps you learn about Pandas.
Happy Learning !!
Related Articles
- Different Ways to Rename Pandas DataFrame Column
- How to Drop Column From Pandas DataFrame
- Pandas- How to get a Specific Cell Value from DataFrame
- Pandas Filter DataFrame by Multiple Conditions
- Pandas apply map (applymap()) Explained
- Apply Multiple Filters to Pandas DataFrame or Series
- Pandas Filter Rows by Conditions
- Pandas Filter by Column Value
- Pandas Read SQL Query or Table with Examples
- Pandas DataFrame fillna() function
Hi there, I liked this article. One point…
Perhaps the implementation has changed, but I don’t think the pandas query likes spaces in the column names. I had a column named “Return Date” and I had to change it to “Return_Date” to avoid getting a key error.