You can select rows in a pandas DataFrame based on a list of indices by using DataFrame.iloc[] or DataFrame.loc[df.index[]]. iloc[] takes row positions as a list. loc[] takes row labels as a list, hence use df.index[] to convert the positions into the corresponding row labels. In this article, I will explain how to use a list of indexes to select rows from a pandas DataFrame, with examples.
Key Points –
- Use the iloc[] indexer to select rows by index position when you have a list of indices.
- loc[] can be used for label-based selection if the DataFrame has a labeled index; otherwise, use iloc[] for integer index positions.
- Providing a list of indices inside iloc[] or loc[] enables selection of multiple, specific rows in a single step.
- Combining a list of indices with isin() on the DataFrame index can be helpful for checking the existence of rows before selecting.
- Sorting the list of indices before passing it to iloc[] can improve readability of the selected rows.
- If index labels are non-unique, loc[] will return every row that carries a requested label.
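The difference between position-based and label-based selection described in the key points can be sketched as follows. This is only a minimal illustration: the frame `demo` and its index labels are made up for the example and are not part of the article's dataset.

```python
import pandas as pd

# Hypothetical frame whose integer labels differ from the row positions,
# with the label 20 deliberately repeated.
demo = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python", "pandas"]},
    index=[10, 20, 20, 30],
)

# iloc[] works on positions: the rows at positions 0 and 3.
by_position = demo.iloc[[0, 3]]

# loc[] works on labels: every row labelled 20 is returned.
by_label = demo.loc[[20]]

print(by_position)
print(by_label)
```

Here `by_position` holds the first and last rows, while `by_label` holds both rows that share the label 20.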
1. Quick Examples of Select Pandas Rows Based on List Index
If you are in a hurry, below are some quick examples of how to select pandas DataFrame rows based on a list of indexes.
# Quick examples of select pandas rows based on list index
# How to select Pandas Rows Based on list
# Using df.iloc[ind_list]
ind_list = [1, 3]
df.iloc[ind_list]
# How to select Pandas rows based on list
# Using df.loc[df.index[index_list]]
index_list = [0,2]
df.loc[df.index[index_list]]
# Get Pandas rows on list index
# Using index.isin().
df2=df[df.index.isin([1,3])]
print(df2)
# Get pandas rows on list index by df.take()
df2=df.take([1,2])
print(df2)
# Get Pandas rows on list index by df.query()
index_list = [1,2]
df2=df.query('index in @index_list')
Now, let’s create a pandas DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
# Create DataFrame
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
df = pd.DataFrame(technologies)
print("Create DataFrame\n",df)
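Yields below output.
# Output:
# Create DataFrame
Courses Fee Duration Discount
0 Spark 20000 30days 1000
1 PySpark 25000 40days 2300
2 Python 22000 35days 1200
3 pandas 30000 50days 2000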
2. Using DataFrame.iloc[] to Select Rows From List Index
DataFrame.iloc[ind_list] selects rows by integer position. Pass the positions you want to select as a list to this indexer. Let’s see an example.
In this example, ind_list is a list containing the positions of the rows you want to select (1 and 3). You can pass this list to iloc[] to extract the corresponding rows from the DataFrame.
# Select Rows from List Index
# Using df.iloc[ind_list]
ind_list = [1, 3]
result = df.iloc[ind_list]
print("Select rows from the list index:\n",result)
Yields below output. This selects the 2nd and 4th rows, since positions start from zero.
# Output:
# Select rows from the list index:
Courses Fee Duration Discount
1 PySpark 25000 40days 2300
3 pandas 30000 50days 2000
Remember that the values in the list should be valid integer positions in the DataFrame. Also, note that the resulting DataFrame (result in this case) will have the same column structure as the original DataFrame (df), and it will be a subset of the original DataFrame based on the specified positions.
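One way to guard against invalid positions is to filter the list before handing it to iloc[]. The sketch below assumes a small helper, `select_valid_positions`, which is a hypothetical name introduced here for illustration and not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Python", "pandas"],
    "Fee": [20000, 25000, 22000, 30000],
})

def select_valid_positions(frame, positions):
    """Hypothetical helper: keep only the positions that exist in `frame`."""
    n = len(frame)
    # iloc[] also accepts negative positions, so the valid range is [-n, n).
    valid = [p for p in positions if -n <= p < n]
    return frame.iloc[valid]

# 7 is out of bounds for a 4-row frame and is silently dropped.
result = select_valid_positions(df, [1, 3, 7])
print(result)
```

Whether dropping invalid positions silently is acceptable depends on the use case; raising an error may be the safer choice when a missing row indicates a bug.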
3. Using df.loc[df.index[]] to Select Rows From List Index
Alternatively, you can select rows from a list of positions by using df.loc[df.index[]]. loc[] selects rows by label, so in order to select by position, first use df.index[], which returns the row labels at the given positions.
# Select rows from list
# Using df.loc[df.index[index_list]]
index_list = [0,2]
result = df.loc[df.index[index_list]]
print("Select rows from the list index:\n",result)
Yields below output.
# Output:
# Select rows from the list index:
Courses Fee Duration Discount
0 Spark 20000 30days 1000
2 Python 22000 35days 1200
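With the default 0..3 index, positions and labels happen to coincide, so the conversion done by df.index[] is easy to miss. The sketch below uses a hypothetical string-labeled frame, `df_labeled`, to make the positions-to-labels step visible.

```python
import pandas as pd

# Hypothetical labels; the article's DataFrame uses the default integer index.
df_labeled = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python", "pandas"],
     "Fee": [20000, 25000, 22000, 30000]},
    index=["r1", "r2", "r3", "r4"],
)

index_list = [0, 2]
labels = df_labeled.index[index_list]  # positions -> labels
result = df_labeled.loc[labels]        # label-based selection

print(list(labels))
print(result)
```

The same rows could be obtained directly with `df_labeled.iloc[index_list]`; the loc[] form is mainly useful when the rest of the code already works with labels.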
4. Get Pandas Rows on List Index Using isin()
You can also select rows using df.index.isin(), which checks whether each index label is contained in the given values and returns a boolean mask. This is generally a fast approach. Note that isin() matches index labels rather than positions, so if your DataFrame has a custom labeled index, pass the labels instead of integer positions.
In the below example, df.index.isin(selected_indices) creates a boolean mask that is True for the rows with indices present in the selected_indices list. Applying this mask to the DataFrame using df.loc[] selects the corresponding rows.
# Get Pandas rows on list index
# Using index.isin()
selected_indices = [1,3]
result = df.loc[df.index.isin(selected_indices)]
print("Select rows from the list index:\n",result)
Yields below output.
# Output:
# Select rows from the list index:
Courses Fee Duration Discount
1 PySpark 25000 40days 2300
3 pandas 30000 50days 2000
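Two properties of the boolean-mask approach are worth seeing in a small sketch: the result follows the DataFrame's row order rather than the order of the list, and labels that do not exist are simply ignored. The list contents below are chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})

# 3 is listed before 1, and 9 does not exist in the index.
selected_indices = [3, 1, 9]
mask = df.index.isin(selected_indices)
result = df.loc[mask]

print(mask)
print(result)
```

If the order of the list matters, iloc[] or take() is likely the better fit, since both return rows in the order given.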
5. Get Pandas Rows on List Index by DataFrame.take()
df.take() returns the elements at the given positional indices along an axis. This means we are not indexing according to the actual values in the index attribute of the object; we are indexing according to the actual position of the element in the object.
# Get pandas rows on list index by df.take().
df2=df.take([1,2])
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
1 PySpark 25000 40days 2300
2 Python 22000 35days 1200
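To see that take() really works on positions and not on index values, it helps to use a frame whose labels are out of order. The sketch below assumes a hypothetical `df_shuffled` with a reversed integer index, and also shows the axis parameter selecting a column by position.

```python
import pandas as pd

# Hypothetical frame with a reversed integer index.
df_shuffled = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python", "pandas"],
     "Fee": [20000, 25000, 22000, 30000]},
    index=[3, 2, 1, 0],
)

# Positions 1 and 2, regardless of the labels attached to those rows.
rows = df_shuffled.take([1, 2])

# axis=1 applies the same positional logic to columns.
cols = df_shuffled.take([0], axis=1)

print(rows)
print(cols)
```

The selected rows carry the labels 2 and 1, which confirms that the list was interpreted as positions.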
6. Get Pandas Rows on List Index by DataFrame.query()
Finally, you can use df.query(), which filters a DataFrame with a boolean expression, to get rows by a list of index values. For example,
# Get Pandas rows on list index by df.query().
index_list = [1,2]
df2=df.query('index in @index_list')
print(df2)
Yields below output.
# Output:
Courses Fee Duration Discount
1 PySpark 25000 40days 2300
2 Python 22000 35days 1200
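Because the expression compares against index values, the same query should also work with a labeled index. The sketch below assumes a hypothetical string-labeled frame, `df_labeled`, and compares the result with the isin() approach.

```python
import pandas as pd

# Hypothetical labels, used only for this illustration.
df_labeled = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Python", "pandas"]},
    index=["r1", "r2", "r3", "r4"],
)

label_list = ["r2", "r3"]

# `index` refers to the DataFrame index; `@label_list` refers to the Python variable.
via_query = df_labeled.query("index in @label_list")
via_isin = df_labeled[df_labeled.index.isin(label_list)]

print(via_query)
```

Both selections return the rows labelled r2 and r3, so the choice between them is mostly a matter of readability.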
7. Complete Examples
import pandas as pd
technologies = {
'Courses':["Spark","PySpark","Python","pandas"],
'Fee' :[20000,25000,22000,30000],
'Duration':['30days','40days','35days','50days'],
'Discount':[1000,2300,1200,2000]
}
df = pd.DataFrame(technologies)
print(df)
# Select Rows from List Index
# Using df.iloc[ind_list]
ind_list = [1, 3]
print(df.iloc[ind_list])
# Select rows from list
# Using df.loc[df.index[index_list]]
index_list = [0,2]
print(df.loc[df.index[index_list]])
# Get Pandas rows on list index
# Using index.isin().
df2=df[df.index.isin([1,3])]
print(df2)
# Get pandas rows on list index by df.take()
df2=df.take([1,2])
print(df2)
# Get Pandas rows on list index by df.query()
index_list = [1,2]
df2=df.query('index in @index_list')
print(df2)
Frequently Asked Questions on Select Rows Based on List Index
To select specific rows from a pandas DataFrame based on a list of indices, you can use the iloc[] or loc[] indexers.
You can filter rows based on a condition involving the index values in pandas. One common approach is to use the isin() method with loc[] or with boolean indexing.
You can use the take() method in pandas to select rows based on a list of indices. The take() method returns the elements at the specified positions along a particular axis.
While the primary purpose of the query() method in pandas is to perform condition-based queries, you can still use it to filter rows based on a list of indices by creating a query string. However, it’s important to note that other methods like iloc[], loc[], or take() are generally more straightforward for this task.
If you want to select rows based on a list of indices with a specific order, the take() method is particularly useful. It retrieves elements from a DataFrame at the specified positions in the order given in the list.
If a position in the list is out of bounds for the DataFrame, using it with take() or iloc[] will raise an IndexError, while passing a missing label to loc[] raises a KeyError. It’s essential to ensure that the indices in your list are valid for the DataFrame to avoid such errors.
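A quick way to see how each approach reacts to an invalid entry is to try them side by side. The sketch below reflects the behavior I would expect from recent pandas versions; the `outcomes` dictionary is just a convenience for collecting the results.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark", "Python", "pandas"]})
bad = [1, 10]  # 10 is neither a valid position nor an existing label

outcomes = {}
selectors = {
    "iloc": lambda: df.iloc[bad],
    "take": lambda: df.take(bad),
    "loc": lambda: df.loc[bad],
}
for name, select in selectors.items():
    try:
        select()
        outcomes[name] = "ok"
    except (IndexError, KeyError) as exc:
        # Record which exception type each selector raised.
        outcomes[name] = type(exc).__name__

# isin() raises nothing; the missing label is simply not matched.
outcomes["isin"] = df[df.index.isin(bad)].index.tolist()
print(outcomes)
```

Catching both exception types, or validating the list up front, keeps the selection code robust regardless of which indexer is used.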
Conclusion
In this article, you have learned how to select pandas rows based on a list of indexes using DataFrame.iloc[] and DataFrame.loc[df.index[index_list]]. You have also learned how to do the same using df.index.isin(), DataFrame.take(), and DataFrame.query(), with examples.