  • Post category:Pandas
  • Post last modified:March 27, 2024
Pandas Select Rows Based on List Index

You can select rows in a pandas DataFrame from a list of indices by using DataFrame.iloc[] or DataFrame.loc[df.index[]]. iloc[] takes a list of integer row positions. loc[] takes a list of row labels, so use df.index[] to convert positions into the corresponding row labels. In this article, I will explain how to use a list of indexes to select rows from a pandas DataFrame, with examples.

1. Quick Examples of Select Pandas Rows Based on List Index

If you are in a hurry, below are some quick examples of how to select rows from a pandas DataFrame based on a list index.


# Below are quick examples.

# How to select Pandas Rows Based on list 
# Using df.iloc[ind_list]
ind_list = [1, 3]
df.iloc[ind_list]

# How to select Pandas Rows Based on list 
# Using df.loc[df.index[index_list]]
index_list = [0,2]
df.loc[df.index[index_list]]

# Get Pandas rows on list index 
# Using index.isin().
df2=df[df.index.isin([1,3])]
print(df2)

# Get pandas rows on list index by df.take()
df2=df.take([1,2])
print(df2)

# Get Pandas rows on list index by df.query()
index_list = [1,2]
df2=df.query('index in @index_list')

Now, let’s create a pandas DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
df = pd.DataFrame(technologies)
print("Create DataFrame\n",df)

Yields below output.

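# Output:
# Create DataFrame
    Courses    Fee Duration  Discount
0    Spark  20000   30days      1000
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
3   pandas  30000   50days      2000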

2. Using DataFrame.iloc[] to Select Rows From List Index

DataFrame.iloc[ind_list] selects rows by their integer positions. Pass the positions you want to select as a list. Let’s see with an example.

In this example, ind_list is a list containing the indices of the rows you want to select (1 and 3). You can pass this list to iloc to extract the corresponding rows from the DataFrame.


# Select Rows from List Index 
# Using df.iloc[ind_list]
ind_list = [1, 3]
result = df.iloc[ind_list]
print("Select rows from the list index:\n",result)

Yields below output. Because positions start at zero, this selects the second and fourth rows.

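# Output:
# Select rows from the list index:
    Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
3   pandas  30000   50days      2000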

Remember that every value in the list must be a valid integer position in the DataFrame; otherwise iloc[] raises an IndexError. Also, note that the resulting DataFrame (result in this case) has the same columns as the original DataFrame (df) and contains only the rows at the specified positions.
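To make the difference between positions and labels concrete, here is a minimal sketch. The df_demo DataFrame and its index values 10 to 40 are hypothetical and only serve to show that iloc[] ignores the labels.

```python
import pandas as pd

# Hypothetical variant of the sample data with a non-default integer index
df_demo = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    },
    index=[10, 20, 30, 40],
)

# iloc counts positions from zero, so [1, 3] means the 2nd and 4th rows
by_position = df_demo.iloc[[1, 3]]
print(by_position)

# The selected rows keep their original labels (20 and 40), not 1 and 3
print(by_position.index.tolist())
```

With the default RangeIndex used elsewhere in this article, positions and labels happen to coincide, which is why the two are easy to confuse.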

3. Using df.loc[df.index[]] to Select Rows From List Index

Alternatively, you can select rows from the list index by using df.loc[df.index[]]. loc[] selects rows by label, so to select by position, first pass the positions to df.index[]. This returns the row labels at the given positions, which loc[] then uses to look up the rows.


# Select rows from list 
# Using df.loc[df.index[index_list]]
index_list = [0,2]
result = df.loc[df.index[index_list]]
print("Select rows from the list index:\n",result)

Yields below output.


# Output:
# Select rows from the list index:
    Courses    Fee Duration  Discount
0    Spark  20000   30days      1000
2   Python  22000   35days      1200
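The two-step lookup is easier to see when the labels are not integers. The following is a small sketch under the assumption of a hypothetical df_demo whose rows are labeled "r1" to "r4"; these names are not part of the original example.

```python
import pandas as pd

# Hypothetical string labels make the position -> label step visible
df_demo = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    },
    index=["r1", "r2", "r3", "r4"],
)

positions = [0, 2]

# Step 1: df.index[...] turns positions into the labels stored there
labels = df_demo.index[positions]
print(list(labels))

# Step 2: loc[] looks the rows up by those labels
by_label = df_demo.loc[labels]
print(by_label)
```

Here df.loc[df.index[positions]] returns the same rows as df.iloc[positions], so this form is mainly useful when you need the labels themselves along the way.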

4. Get Pandas Rows on List Index Using isin()

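You can also select rows from a list index using index.isin(), which checks whether each index value of the DataFrame is contained in the given list and returns a boolean mask. It is often a fast approach. Note that isin() matches index labels, not positions, so if your DataFrame has a non-default index (for example, string labels), you must pass labels rather than integer positions.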

In the below example, df.index.isin(selected_indices) creates a boolean mask that is True for the rows with indices present in the selected_indices list. Applying this mask to the DataFrame using df.loc[] selects the corresponding rows.


# Get Pandas rows on list index 
# Using index.isin()
selected_indices = [1,3]
result = df.loc[df.index.isin(selected_indices)]
print("Select rows from the list index:\n",result)

Yields below output.


# Output:
# Select rows from the list index:
    Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
3   pandas  30000   50days      2000
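Two properties of a boolean mask are worth seeing in code: the rows come back in DataFrame order rather than list order, and values missing from the index are skipped without an error. This is a minimal sketch; df_demo, its "r1" to "r4" labels, and the wanted list are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical DataFrame with string row labels
df_demo = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    },
    index=["r1", "r2", "r3", "r4"],
)

wanted = ["r4", "r1", "r9"]  # "r9" does not exist in the index

# One boolean per row: True where the row label is in the list
mask = df_demo.index.isin(wanted)
by_mask = df_demo[mask]
print(by_mask)  # rows come back in DataFrame order, "r9" is ignored

# If the list order matters, filter the list and pass it to loc[] instead
present = [label for label in wanted if label in df_demo.index]
in_list_order = df_demo.loc[present]
print(in_list_order)
```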

5. Get Pandas Rows on List Index by DataFrame.take()

df.take() also returns the rows at the given positional indices along an axis. This means we are not indexing according to the actual values in the index attribute of the object; we are indexing according to the actual position of the element in the object.


# Get pandas rows on list index by df.take().
df2=df.take([1,2])
print(df2)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
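As a short illustration of the positional behavior, the sketch below uses a hypothetical df_demo to show that take() keeps the order of the list, accepts negative positions counted from the end, and can work along the column axis through its axis parameter.

```python
import pandas as pd

# Hypothetical two-column version of the sample data
df_demo = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    }
)

# Rows are returned in the order given in the list
reordered = df_demo.take([3, 0])
print(reordered)

# Negative positions count from the end of the DataFrame
last_two = df_demo.take([-1, -2])
print(last_two)

# axis=1 takes columns by position instead of rows
first_column = df_demo.take([0], axis=1)
print(first_column)
```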

6. Get Pandas Rows on List Index by DataFrame.query()

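Finally, you can use df.query(), which filters a DataFrame with a boolean expression, to get rows by a list of index values. Inside the expression, index refers to the DataFrame index and the @ prefix refers to a Python variable. For example,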


# Get Pandas rows on list index by df.query().
index_list = [1,2]
df2=df.query('index in @index_list')
print(df2)

Yields below output.


# Output:
   Courses    Fee Duration  Discount
1  PySpark  25000   40days      2300
2   Python  22000   35days      1200
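The query string can also refer to the index by name and can negate the condition. The following is a sketch under assumptions: df_demo and the index name course_id are hypothetical and do not appear in the original example.

```python
import pandas as pd

# Hypothetical DataFrame whose index has a name ("course_id")
df_demo = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Python", "pandas"],
        "Fee": [20000, 25000, 22000, 30000],
    },
    index=pd.Index([101, 102, 103, 104], name="course_id"),
)

index_list = [102, 103]

# "index" always refers to the row index inside the query string
by_keyword = df_demo.query("index in @index_list")

# A named index can also be referenced by its name
by_name = df_demo.query("course_id in @index_list")

# "not in" keeps every row whose label is absent from the list
excluded = df_demo.query("index not in @index_list")

print(by_keyword)
print(excluded)
```

Like isin(), query() matches index labels, so the list holds the labels 102 and 103 rather than positions.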

7. Complete Examples


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Python","pandas"],
    'Fee' :[20000,25000,22000,30000],
    'Duration':['30days','40days','35days','50days'],
    'Discount':[1000,2300,1200,2000]
              }
df = pd.DataFrame(technologies)
print(df)

# Select rows from list index
# Using df.iloc[ind_list]
ind_list = [1, 3]
print(df.iloc[ind_list])

# Select rows from list
# Using df.loc[df.index[index_list]]
index_list = [0, 2]
print(df.loc[df.index[index_list]])

# Get pandas rows on list index
# Using index.isin()
df2 = df[df.index.isin([1, 3])]
print(df2)

# Get pandas rows on list index by df.take()
df2 = df.take([1, 2])
print(df2)

# Get pandas rows on list index by df.query()
index_list = [1, 2]
df2 = df.query('index in @index_list')
print(df2)

Frequently Asked Questions on Select Rows Based on List Index

How can I select specific rows from a pandas DataFrame based on a list of indices?

To select specific rows from a pandas DataFrame based on a list of indices, you can use the iloc[] or loc[] indexers: iloc[] for integer positions and loc[] for labels.

Is there a way to filter rows based on a condition involving the index values?

You can filter rows based on a condition involving the index values in pandas. One common approach is to use the index's isin() method together with loc[] or boolean indexing.

Can I use the take() method to select rows based on a list of indices?

You can use the take() method in pandas to select rows based on a list of indices. The take() method returns the elements at the specified indices along a particular axis.

Is it possible to use the query() method to select rows based on a list of indices?

While the primary purpose of the query() method in pandas is to perform condition-based queries, you can still use it to filter rows based on a list of indices by creating a query string. However, it’s important to note that other methods like iloc, loc, or take are generally more straightforward for this task.

What if I want to select rows based on a list of indices with a specific order?

If you want to select rows based on a list of indices with a specific order, take(), iloc[], and loc[] all return the rows in the order given in the list. A boolean mask built with isin() does not; it keeps the rows in their original DataFrame order.

What happens if an index in the list is out of bounds for the DataFrame?

If a position in the list is out of bounds for the DataFrame, take() and iloc[] raise an IndexError. loc[] raises a KeyError when a label in the list is missing from the index, while a mask built with isin() silently skips values that are not present. It’s essential to ensure that the indices in your list are valid for the DataFrame to avoid such errors.
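The different failure modes can be compared side by side. This is a minimal sketch; df_demo, bad_list, and the selectors dictionary are hypothetical names introduced only for this comparison.

```python
import pandas as pd

# Hypothetical four-row DataFrame with the default index 0..3
df_demo = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python", "pandas"]}
)

bad_list = [1, 7]  # 7 is neither a valid position nor an existing label

# Each entry performs the same selection with a different tool
selectors = {
    "iloc": lambda: df_demo.iloc[bad_list],
    "take": lambda: df_demo.take(bad_list),
    "loc": lambda: df_demo.loc[bad_list],
}

# Record which exception type (if any) each selector raises
outcomes = {}
for name, select in selectors.items():
    try:
        select()
        outcomes[name] = "ok"
    except (IndexError, KeyError) as err:
        outcomes[name] = type(err).__name__

print(outcomes)

# isin() raises nothing: the unknown value is simply not matched
kept = df_demo[df_demo.index.isin(bad_list)]
print(kept)
```

If silent skipping is not what you want, checking the list against df.index before selecting is one way to surface bad values early.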

Conclusion

In this article, you have learned how to select pandas rows based on a list index using DataFrame.iloc[] and DataFrame.loc[df.index[index_list]]. You have also learned how to do the same using Index.isin(), DataFrame.take(), and DataFrame.query(), with examples.


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.
