  • Post category:Pandas
  • Post last modified:October 28, 2024

You can select rows in a pandas DataFrame based on a list of indices by using DataFrame.iloc[] or DataFrame.loc[df.index[]]. iloc[] takes row positions as a list. loc[] takes row labels as a list, so use df.index[] to convert positions into the corresponding row labels. In this article, I will explain how to use a list of indexes to select rows from a pandas DataFrame with examples.


Key Points –

  • Use the iloc[] indexer to select rows by index position when you have a list of indices.
  • loc[] can be used for label-based selection if the DataFrame has a labeled index; otherwise, use iloc[] for integer index positions.
  • Providing a list of indices inside iloc[] or loc[] enables selection of multiple, specific rows in a single step.
  • Combining a list of indices with isin() on the DataFrame index can be helpful for checking the existence of rows before selecting.
  • iloc[] returns rows in the order given in the list, so sorting the list first returns them in their original DataFrame order, which can improve readability.
  • If index labels are non-unique, loc[] returns every row that carries a requested label, whereas iloc[] always returns exactly one row per position.
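The difference between label-based and position-based selection is easiest to see on an index with repeated labels. The following is a minimal sketch, assuming a small hypothetical DataFrame (df_dup) that is not part of the examples in this article.

```python
import pandas as pd

# Hypothetical frame whose index contains the label "a" twice
df_dup = pd.DataFrame({"Fee": [20000, 25000, 22000]}, index=["a", "b", "a"])

# loc[] is label-based: the label "a" matches two rows
by_label = df_dup.loc[["a"]]

# iloc[] is position-based: position 0 is always exactly one row
by_position = df_dup.iloc[[0]]

print(by_label)
print(by_position)
```

Here loc[] returns two rows for a single requested label, while iloc[] returns one row for a single requested position.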

1. Quick Examples of Select Pandas Rows Based on List Index

If you are in a hurry, below are some quick examples of how to select pandas rows based on list index in Pandas DataFrame.


# Quick examples of select pandas rows based on list index

# How to select Pandas Rows Based on list 
# Using df.iloc[ind_list]
ind_list = [1, 3]
df.iloc[ind_list]

# How to select Pandas rows based on list 
# Using df.loc[df.index[index_list]]
index_list = [0,2]
df.loc[df.index[index_list]]

# Get Pandas rows on list index 
# Using index.isin().
df2=df[df.index.isin([1,3])]
print(df2)

# Get pandas rows on list index by df.take()
df2=df.take([1,2])
print(df2)

# Get Pandas rows on list index by df.query()
index_list = [1,2]
df2=df.query('index in @index_list')

Now, let’s create a pandas DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
df = pd.DataFrame(technologies)
print("Create DataFrame\n",df)

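Yields below output.


# Output:
# Create DataFrame
   Courses    Fee Duration  Discount
0    Spark  20000   30days      1000
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
3   pandas  30000   50days      2000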

2. Using DataFrame.iloc[] to Select Rows From List Index

The DataFrame.iloc[ind_list] indexer selects rows by their integer positions. Pass the positions you want to select as a list. Let’s see an example.

In this example, ind_list is a list containing the indices of the rows you want to select (1 and 3). You can pass this list to iloc to extract the corresponding rows from the DataFrame.


# Select Rows from List Index 
# Using df.iloc[ind_list]
ind_list = [1, 3]
result = df.iloc[ind_list]
print("Select rows from the list index:\n",result)

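Yields below output. This selects the second and fourth rows, because positions start from zero.


# Output:
# Select rows from the list index:
   Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
3   pandas  30000   50days      2000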

Remember that the indices in the list must be valid integer positions in the DataFrame. Also, note that the resulting DataFrame (result in this case) has the same columns as the original DataFrame (df) and contains only the rows at the specified positions.
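Two properties of iloc[] with a list are worth seeing in code: rows come back in the order of the list, and negative positions count from the end. The sketch below assumes a trimmed two-column version of the DataFrame used in this article; the variable names reordered and last_two are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
})

# Rows are returned in the order given in the list, not in DataFrame order
reordered = df.iloc[[3, 1]]

# Negative positions count from the end, as with Python lists
last_two = df.iloc[[-2, -1]]

print(reordered)
print(last_two)
```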

3. Using df.loc[df.index[]] to Select Rows From List Index

Alternatively, you can select rows from a list of positions by using df.loc[df.index[]]. The loc[] indexer selects rows by label, so to select by position you first convert the positions to labels with df.index[]. Indexing the df.index property with a list of positions returns the row labels at those positions.


# Select rows from list 
# Using df.loc[df.index[index_list]]
index_list = [0,2]
result = df.loc[df.index[index_list]]
print("Select rows from the list index:\n",result)

Yields below output.


# Output:
# Select rows from the list index:
   Courses    Fee Duration  Discount
0   Spark  20000   30days      1000
2  Python  22000   35days      1200
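With the default 0..n-1 index, positions and labels coincide, so the conversion step is easy to overlook. The following sketch, assuming a hypothetical string-labeled copy of the data (df_labeled with labels r1 to r4), shows what df.index[] actually contributes.

```python
import pandas as pd

df_labeled = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    },
    index=["r1", "r2", "r3", "r4"],  # hypothetical string labels
)

index_list = [0, 2]

# Convert positions to the labels stored at those positions
labels = df_labeled.index[index_list]

# loc[] then selects by those labels
result = df_labeled.loc[labels]
print(result)
```

Passing [0, 2] directly to df_labeled.loc[] would fail here, because 0 and 2 are not labels in this index.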

4. Get Pandas Rows on List Index Using isin()

You can also select rows from a list of indices using the index.isin() method, which checks whether each index label is contained in the given values. This is typically a fast approach. Note that isin() matches index labels, not positions, so if your DataFrame has a custom labeled index you must pass labels rather than integer positions.

In the below example, df.index.isin(selected_indices) creates a boolean mask that is True for the rows with indices present in the selected_indices list. Applying this mask to the DataFrame using df.loc[] selects the corresponding rows.


# Get Pandas rows on list index 
# Using index.isin()
selected_indices = [1,3]
result = df.loc[df.index.isin(selected_indices)]
print("Select rows from the list index:\n",result)

Yields below output.


# Output:
# Select rows from the list index:
   Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
3   pandas  30000   50days      2000
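Because isin() builds a boolean mask, it behaves differently from iloc[] in two ways: the result follows DataFrame order rather than list order, and values that are not in the index are silently ignored. A minimal sketch, assuming a one-column version of the DataFrame and an illustrative list named wanted:

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

wanted = [3, 1, 10]  # 10 is deliberately not in the index

# Boolean mask: True where the index label appears in the list
mask = df.index.isin(wanted)

# Rows come back in DataFrame order (1, then 3); 10 is silently skipped
result = df.loc[mask]

# Report which requested labels do not exist
missing = [i for i in wanted if i not in df.index]

print(result)
print("Missing:", missing)
```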

5. Get Pandas Rows on List Index by DataFrame.take()

The df.take() method also returns the elements at the given positional indices along an axis. It does not select by the actual values in the object's index; it selects by the position of each element in the object.


# Get pandas rows on list index by df.take().
df2=df.take([1,2])
print(df2)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
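That take() ignores index labels is clearer when the labels differ from the positions. The sketch below assumes a hypothetical copy of the data with integer labels 10 to 40 (df_labeled), and also shows the axis parameter selecting a column by position.

```python
import pandas as pd

df_labeled = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    },
    index=[10, 20, 30, 40],  # hypothetical labels that differ from positions
)

# Positions 1 and 2 are the second and third rows (labels 20 and 30)
rows = df_labeled.take([1, 2])

# axis=1 takes columns by position instead of rows
cols = df_labeled.take([0], axis=1)

print(rows)
print(cols)
```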

6. Get Pandas Rows on List Index by DataFrame.query()

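Finally, you can use the df.query() method, which filters a DataFrame with a boolean expression, to get rows by a list of index labels. Inside the expression, index refers to the DataFrame index and the @ prefix refers to a Python variable. For example,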


# Get Pandas rows on list index by df.query().
index_list = [1,2]
df2=df.query('index in @index_list')
print(df2)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
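The same query syntax can also be inverted with not in to drop the listed rows instead of keeping them. A minimal sketch, assuming a one-column version of the DataFrame; the names selected and excluded are illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

index_list = [1, 2]

# Keep rows whose index label is in the list
selected = df.query("index in @index_list")

# Keep rows whose index label is NOT in the list
excluded = df.query("index not in @index_list")

print(selected)
print(excluded)
```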

7. Complete Examples


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
df = pd.DataFrame(technologies)
print(df)

# Select Rows from List Index 
# Using df.iloc[ind_list]
ind_list = [1, 3]
print(df.iloc[ind_list])

# Select rows from list 
# Using df.loc[df.index[index_list]]
index_list = [0,2]
print(df.loc[df.index[index_list]])

# Get Pandas rows on list index 
# Using index.isin().
df2=df[df.index.isin([1,3])]
print(df2)

# Get pandas rows on list index by df.take()
df2=df.take([1,2])
print(df2)

# Get Pandas rows on list index by df.query()
index_list = [1,2]
df2=df.query('index in @index_list')
print(df2)

Frequently Asked Questions on Select Rows Based on List Index

How can I select specific rows from a pandas DataFrame based on a list of indices?

To select specific rows from a pandas DataFrame based on a list of indices, you can use the iloc[] indexer for positions or the loc[] indexer for labels.

Is there a way to filter rows based on a condition involving the index values?

You can filter rows based on a condition involving the index values in pandas. One common approach is to use the isin() method with the loc method or boolean indexing.

Can I use the take() method to select rows based on a list of indices?

You can use the take() method in pandas to select rows based on a list of indices. The take() method returns the elements at the specified indices along a particular axis.

Is it possible to use the query() method to select rows based on a list of indices?

While the primary purpose of the query() method in pandas is to perform condition-based queries, you can still use it to filter rows based on a list of indices by creating a query string. However, it’s important to note that other methods like iloc, loc, or take are generally more straightforward for this task.

What if I want to select rows based on a list of indices with a specific order?

If you want to select rows based on a list of indices with a specific order, both take() and iloc[] return the rows in the order given in the list. By contrast, a boolean mask built with isin() returns rows in their original DataFrame order.

What happens if an index in the list is out of bounds for the DataFrame?

If a position in the list is out of bounds for the DataFrame, take() and iloc[] raise an IndexError. If a label passed to loc[] does not exist in the index, loc[] raises a KeyError instead. Make sure the values in your list are valid for the DataFrame to avoid such errors.
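One way to guard against these errors is to filter the list before selecting. The following sketch assumes a one-column version of the DataFrame and an illustrative list named requested; it records the exception type raised by each indexer and then selects only the valid positions.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

requested = [1, 7]  # 7 is out of bounds for a 4-row DataFrame

# iloc[] is positional, so an invalid position raises IndexError
try:
    df.iloc[requested]
    iloc_error = None
except IndexError as exc:
    iloc_error = type(exc).__name__

# loc[] is label-based, so a missing label raises KeyError
try:
    df.loc[requested]
    loc_error = None
except KeyError as exc:
    loc_error = type(exc).__name__

# Keep only positions that exist, then select safely
valid = [i for i in requested if 0 <= i < len(df)]
safe = df.iloc[valid]

print(iloc_error, loc_error)
print(safe)
```

This filter only accepts non-negative positions; if negative positions should be allowed, the bounds check would need to be widened accordingly.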

Conclusion

In this article, you have learned how to select pandas rows based on a list of indices using DataFrame.iloc[] and DataFrame.loc[df.index[index_list]]. You have also learned how to do the same with Index.isin(), DataFrame.take(), and DataFrame.query(), with examples for each.
