• Post author:
  • Post category:Pandas
  • Post last modified:March 27, 2024
  • Reading time:18 mins read
You are currently viewing Select Rows From List of Values in Pandas DataFrame

To select rows from a Pandas DataFrame based on a list of values, you can use the isin() method. You can select rows from a list of values in pandas DataFrame either using DataFrame.isin(), DataFrame.query(), DataFrame.index(), DataFrame.loc[] attribute or DataFrame.apply() method with a lambda function. In this article, I will explain how to use a list of values to select rows from Pandas DataFrame with examples.

Related:

1. Quick Examples of Select DataFrame Rows From List of Values in Pandas

If you are in a hurry, below are some quick examples of how to use a list of values to select rows from Pandas DataFrame.


# Below are quick examples

# Example 1: Create a list of values for select rows 
# Using isin([]) method
list_of_values = [25000, 30000]
df2 = df[df['Fee'].isin(list_of_values)]

# Example 2: Using isin([]) method 
# To get opposite rows
df2 = df[~df['Fee'].isin([25000, 30000])]

# Example 3: Using DataFrame.query() method 
# To select rows
df2 = df.query('Discount in [1000,2000]')

# Example 4: Filter rows by list of values
df2 = df.query('Discount == [1000,2000]')

# Example 5: Select list of rows using variable
list_of_value = [1000,2000]
df2 = df.query('Discount in @list_of_value')

# Example 6: Using DataFrame.query() method
list_of_value = [1000,2000]
df2 = df.query('Discount == @list_of_value')

# Example 7: Select rows using DataFrame.loc[] 
# To apply() and lambda function 
df2 = df.loc[df.apply(lambda x: x.Fee in [20000,22000], axis=1)]

# Example 8: Select pandas rows based on list index
index_list = [0,2]
df2 = df.loc[df.index[index_list]]

# Example 9: Using DataFrame.iloc[] 
# To select rows list index
index_list = [0,2]
df2 = df.iloc[index_list]

Now, let’s create a DataFrame with a few rows and columns, execute these examples and validate results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame\n",df)

Yields below output.

pandas select rows list

2. Using DataFrame.isin() to Select Rows From List of Values

DataFrame.isin() method is used to filter/select rows from a list of values. You can have the list of values in variable and use it on isin() or use it directly.

This program creates a boolean mask using isin() on the ‘Fee’ column and then uses this mask to select the rows where the ‘Fee’ values match any value in the list_of_values.


# Create a list of values for select rows 
# Using isin([]) method
list_of_values = [25000, 30000]
df2 = df[df['Fee'].isin(list_of_values)]
print("Selecting rows from the list of values:\n",df2)

Yields below output.

pandas select rows list

Alternatively, you can use the tilde (~) to get the opposite rows. The tilde is used as a bitwise NOT operator, which in this context is used to invert the boolean mask created by isin([]).


# Using isin([]) method to get opposite rows
df2 = df[~df['Fee'].isin([25000, 30000])]
print(df2)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
r1   Spark  20000   30days      1000
r3  Python  22000   35days      1200

3. Using DataFrame.query() Method to Select Multiple Rows Values

Also, use pandas.DataFrame.query() to filter Pandas DataFrame rows from a list of multiple values. With this, you can use in and == operators. Select the rows where the ‘Discount’ column values are either 1000 or 2000. The query() method allows you to write conditions in a SQL-like syntax for convenient DataFrame filtering.


# Using DataFrame.query() method to select rows
df2 = df.query('Discount in [1000,2000]')
print(df2)

# Filter Rows by list of values
df2 = df.query('Discount == [1000,2000]')
print(df2)

# Select list of rows using variable
list_of_value = [1000,2000]
df2 = df.query('Discount in @list_of_value')
print(df2)

# Using DataFrame.query() method
list_of_value = [1000,2000]
df2 = df.query('Discount == @list_of_value')
print(df2)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
r1   Spark  20000   30days      1000
r4  pandas  30000   50days      2000

4. Select Rows Using DataFrame.loc[] with apply() and Lambda

Use DataFrame.loc[] along with apply() and lambda function to select rows that are matching in a list. To select the rows where the ‘Fee’ column values are either 20000 or 22000. The apply() function with a lambda is used to create a boolean mask, and DataFrame.loc[] is used to select the rows based on this mask.


# Select rows using DataFrame.loc[] to apply() and lambda function 
df2 = df.loc[df.apply(lambda x: x.Fee in [20000,22000], axis=1)]
print("Selecting rows from the list of values:\n",df2)

Yields below output.


# Output:
Selecting rows from the list of values:
   Courses    Fee Duration  Discount
r1   Spark  20000   30days      1000
r3  Python  22000   35days      1200

5. Select Pandas Rows Based on List Index

If you want to select specific rows from a Pandas DataFrame based on a list of indices, you can use the loc[] or iloc[] indexing methods. You can use loc[] or iloc[] with your desired indices to select rows from DataFrame. For more details refer to Using Index with loc[] to select rows from DataFrame.

Both loc and iloc allow you to select rows based on specific indices, but they differ in how they interpret the indices (label-based for loc and integer-based for iloc). Choose the one that best fits your needs.


# Select pandas rows based on list index
index_list = [0,2]
df2 = df.loc[df.index[index_list]]
print("Selected rows based on list of indices:\n",df2)

# Using DataFrame.iloc[] 
# To select rows list index
index_list = [0,2]
df2 = df.iloc[index_list]
print("Selected rows based on list of indices:\n",df2)

Yields below output.


# Output:
Selected rows based on list of indices:
  Courses    Fee Duration  Discount
0   Spark  20000   30days      1000
2  Python  22000   35days      1200

6. Complete Example to Select Rows From List of Values in DataFrame


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
index_labels=['r1','r2','r3','r4']
df = pd.DataFrame(technologies,index=index_labels)
df2 = df[df['Duration'].str.contains('35days')]
print(df2)

# Create a list of values for select rows 
# Using isin([]) method
list_of_values = [25000, 30000]
df2 = df[df['Fee'].isin(list_of_values)]
print(df2)

# Using isin([]) method to get opposite rows
df2 = df[~df['Fee'].isin([25000, 30000])]
print(df2)

# Using DataFrame.query() method to select rows
df2 = df.query('Discount in [1000,2000]')
print(df2)

# Filter Rows by list of values
df2 = df.query('Discount == [1000,2000]')
print(df2)

# Select list of rows using variable
list_of_value = [1000,2000]
df2 = df.query('Discount in @list_of_value')
print(df2)

# Using DataFrame.query() method
list_of_value = [1000,2000]
df2 = df.query('Discount == @list_of_value')
print(df2)

# Select rows using DataFrame.loc[] to apply() and lambda function 
df2 = df.loc[df.apply(lambda x: x.Fee in [20000,22000], axis=1)]
print(df2)

# Select pandas rows based on list index
index_list = [0,2]
df2 = df.loc[df.index[index_list]]
print(df2)

# Using DataFrame.iloc[] 
# To select rows list index
index_list = [0,2]
df2 = df.iloc[index_list]
print(df2)

Frequently Asked Questions

How can I select rows from a Pandas DataFrame based on multiple conditions?

You can select rows from a Pandas DataFrame based on multiple conditions by combining the conditions using logical operators like & (AND) and | (OR). For example, the conditions are combined using the & operator. You can adjust the conditions to match your specific criteria. If you want to use the OR operator, you can use |.

How can I select rows based on multiple values in one column?

To select rows based on multiple values in one column of a Pandas DataFrame, you can use the isin method.

Can I use the query() method to select rows based on a condition?

You can use the query() method in Pandas to select rows based on a condition. The query() method allows you to filter rows using a query expression in a SQL-like syntax.

How can I select rows based on a list of indices?

To select rows based on a list of indices in a Pandas DataFrame, you can use the loc or iloc method. For example, df.loc[selected_indices] is used to select rows based on the specified list of indices. The loc method is label-based, so it takes the actual index labels of the DataFrame.

Is there a way to select rows based on a custom function applied to a column?

You can select rows based on a custom function applied to a column in a Pandas DataFrame. You can use the apply() method along with a lambda function or a custom function.

Conclusion

In this article, you have learned how to use a list of values to select rows from DataFrame using DataFrame.isin(), DataFrame.query() and DataFrame.index() functions. Also, you have learned to select rows with row indices.

Happy Learning !!

References

Malli

Malli is an experienced technical writer with a passion for translating complex Python concepts into clear, concise, and user-friendly articles. Over the years, he has written hundreds of articles in Pandas, NumPy, Python, and takes pride in ability to bridge the gap between technical experts and end-users.

Leave a Reply