To select rows from a Pandas DataFrame based on a list of values, you can use the isin()
method. You can select rows from a list of values in pandas DataFrame either using DataFrame.isin()
, DataFrame.query()
, DataFrame.index()
, DataFrame.loc[]
attribute or DataFrame.apply()
method with a lambda function. In this article, I will explain how to use a list of values to select rows from Pandas DataFrame with examples.
Key Points –
- Use
DataFrame.isin()
to check whether each element is in the specified list of values. isin()
can be applied to one or more columns to filter rows based on the condition.- Use the
@
symbol to substitute a Python variable inside aquery()
expression. - Apply
&
(and) or|
(or) operators to filter on multiple conditions across different columns. - Multiple conditions can be applied with
&
(AND) and|
(OR) for complex filtering. - You can chain
isin()
with the DataFrame’s indexing syntax ([]
) to filter rows. - You can use
DataFrame.query()
for selecting rows based on a list of values by creating conditional expressions. - The
~
(tilde) operator can be used to negate the condition and select rows not in the list.
Related:
Quick Examples of Select DataFrame Rows From List of Values in Pandas
If you are in a hurry, below are some quick examples of how to use a list of values to select rows from Pandas DataFrame.
# Quick examples of select dataframe rows from list of values
# Example 1: Create a list of values for select rows
# Using isin([]) method
list_of_values = [25000, 30000]
df2 = df[df['Fee'].isin(list_of_values)]
# Example 2: Using isin([]) method
# To get opposite rows
df2 = df[~df['Fee'].isin([25000, 30000])]
# Example 3: Using DataFrame.query() method
# To select rows
df2 = df.query('Discount in [1000,2000]')
# Example 4: Filter rows by list of values
df2 = df.query('Discount == [1000,2000]')
# Example 5: Select list of rows using variable
list_of_value = [1000,2000]
df2 = df.query('Discount in @list_of_value')
# Example 6: Using DataFrame.query() method
list_of_value = [1000,2000]
df2 = df.query('Discount == @list_of_value')
# Example 7: Select rows using DataFrame.loc[]
# To apply() and lambda function
df2 = df.loc[df.apply(lambda x: x.Fee in [20000,22000], axis=1)]
# Example 8: Select pandas rows based on list index
index_list = [0,2]
df2 = df.loc[df.index[index_list]]
# Example 9: Using DataFrame.iloc[]
# To select rows list index
index_list = [0,2]
df2 = df.iloc[index_list]
Now, let’s create a DataFrame with a few rows and columns, execute these examples and validate results. Our DataFrame contains column names Courses
, Fee
, Duration
, and Discount
.
# Create DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame\n",df)
Yields below output.
Using DataFrame.isin() to Select Rows From List of Values
DataFrame.isin()
method is used to filter/select rows from a list of values. You can have the list of values in variable and use it on isin()
or use it directly.
This program creates a boolean mask using isin()
on the ‘Fee’ column and then uses this mask to select the rows where the ‘Fee’ values match any value in the list_of_values
.
# Create a list of values for select rows
# Using isin([]) method
list_of_values = [25000, 30000]
df2 = df[df['Fee'].isin(list_of_values)]
print("Selecting rows from the list of values:\n",df2)
Yields below output.
Alternatively, you can use the tilde (~)
to get the opposite rows. The tilde is used as a bitwise NOT operator, which in this context is used to invert the boolean mask created by isin([])
.
# Using isin([]) method to get opposite rows
df2 = df[~df['Fee'].isin([25000, 30000])]
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r3 Python 22000 35days 1200
Using DataFrame.query() Method to Select Multiple Rows Values
Also, use pandas.DataFrame.query() to filter Pandas DataFrame rows from a list of multiple values. With this, you can use in
and ==
operators. Select the rows where the ‘Discount’ column values are either 1000 or 2000. The query()
method allows you to write conditions in a SQL-like syntax for convenient DataFrame filtering.
# Using DataFrame.query() method to select rows
df2 = df.query('Discount in [1000,2000]')
print(df2)
# Filter Rows by list of values
df2 = df.query('Discount == [1000,2000]')
print(df2)
# Select list of rows using variable
list_of_value = [1000,2000]
df2 = df.query('Discount in @list_of_value')
print(df2)
# Using DataFrame.query() method
list_of_value = [1000,2000]
df2 = df.query('Discount == @list_of_value')
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r4 pandas 30000 50days 2000
Select Rows Using DataFrame.loc[] with apply() and Lambda
Use DataFrame.loc[]
along with apply()
and lambda function to select rows that are matching in a list. To select the rows where the ‘Fee’ column values are either 20000 or 22000. The apply()
function with a lambda is used to create a boolean mask, and DataFrame.loc[]
is used to select the rows based on this mask.
# Select rows using DataFrame.loc[] to apply() and lambda function
df2 = df.loc[df.apply(lambda x: x.Fee in [20000,22000], axis=1)]
print("Selecting rows from the list of values:\n",df2)
Yields below output.
# Output:
Selecting rows from the list of values:
Courses Fee Duration Discount
r1 Spark 20000 30days 1000
r3 Python 22000 35days 1200
Select Pandas Rows Based on List Index
If you want to select specific rows from a Pandas DataFrame based on a list of indices, you can use the loc[]
or iloc[]
indexing methods. You can use loc[]
or iloc[]
with your desired indices to select rows from DataFrame. For more details refer to Using Index with loc[] to select rows from DataFrame.
Both loc
and iloc
allow you to select rows based on specific indices, but they differ in how they interpret the indices (label-based for loc
and integer-based for iloc
). Choose the one that best fits your needs.
# Select pandas rows based on list index
index_list = [0,2]
df2 = df.loc[df.index[index_list]]
print("Selected rows based on list of indices:\n",df2)
# Using DataFrame.iloc[]
# To select rows list index
index_list = [0,2]
df2 = df.iloc[index_list]
print("Selected rows based on list of indices:\n",df2)
Yields below output.
# Output:
Selected rows based on list of indices:
Courses Fee Duration Discount
0 Spark 20000 30days 1000
2 Python 22000 35days 1200
Complete Example to Select Rows From List of Values in DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
df2 = df[df['Duration'].str.contains('35days')]
print(df2)
# Create a list of values for select rows
# Using isin([]) method
list_of_values = [25000, 30000]
df2 = df[df['Fee'].isin(list_of_values)]
print(df2)
# Using isin([]) method to get opposite rows
df2 = df[~df['Fee'].isin([25000, 30000])]
print(df2)
# Using DataFrame.query() method to select rows
df2 = df.query('Discount in [1000,2000]')
print(df2)
# Filter Rows by list of values
df2 = df.query('Discount == [1000,2000]')
print(df2)
# Select list of rows using variable
list_of_value = [1000,2000]
df2 = df.query('Discount in @list_of_value')
print(df2)
# Using DataFrame.query() method
list_of_value = [1000,2000]
df2 = df.query('Discount == @list_of_value')
print(df2)
# Select rows using DataFrame.loc[] to apply() and lambda function
df2 = df.loc[df.apply(lambda x: x.Fee in [20000,22000], axis=1)]
print(df2)
# Select pandas rows based on list index
index_list = [0,2]
df2 = df.loc[df.index[index_list]]
print(df2)
# Using DataFrame.iloc[]
# To select rows list index
index_list = [0,2]
df2 = df.iloc[index_list]
print(df2)
FAQ on Select Rows From List of Values in Pandas DataFrame
You can select rows from a Pandas DataFrame based on multiple conditions by combining the conditions using logical operators like &
(AND) and |
(OR). For example, the conditions are combined using the &
operator. You can adjust the conditions to match your specific criteria. If you want to use the OR operator, you can use |
.
To select rows based on multiple values in one column of a Pandas DataFrame, you can use the isin
method.
You can use the query()
method in Pandas to select rows based on a condition. The query()
method allows you to filter rows using a query expression in a SQL-like syntax.
To select rows based on a list of indices in a Pandas DataFrame, you can use the loc
or iloc
method. For example, df.loc[selected_indices]
is used to select rows based on the specified list of indices. The loc
method is label-based, so it takes the actual index labels of the DataFrame.
You can select rows based on a custom function applied to a column in a Pandas DataFrame. You can use the apply()
method along with a lambda function or a custom function.
Conclusion
In this article, you have learned how to use a list of values to select rows from DataFrame using DataFrame.isin()
, DataFrame.query()
and DataFrame.index()
functions. Also, you have learned to select rows with row indices.
Happy Learning !!
Related Articles
- Rename Index Values of Pandas DataFrame
- Pandas Drop Last N Rows From DataFrame
- Pandas Drop First N Rows From DataFrame
- Select Multiple Columns in Pandas DataFrame
- Convert String to Float in Pandas DataFrame
- Pandas Select Rows Based on List Index
- Pandas Difference Between Two DataFrames
- Pandas Select DataFrame Rows Between Two Dates
- Pandas Select Rows Based on List Index
- Pandas Select Multiple Columns in DataFrame
- How to add/insert row to Pandas DataFrame?
- Pandas get the number of rows from DataFrame