Pandas iloc[] Usage with Examples

pandas.DataFrame.iloc[] is a property used to select rows and columns by integer position. If you request a single position that does not exist, it raises an IndexError. In this article, I will cover the usage of pandas iloc[] with examples.
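To make the out-of-bounds behavior concrete, here is a minimal sketch. The three-row DataFrame is a hypothetical one used only for this illustration. It also shows that a slice running past the end is truncated rather than raising.

```python
import pandas as pd

# Hypothetical three-row frame (positions 0, 1, 2), for illustration only
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
})

# A single position that does not exist raises IndexError
try:
    df.iloc[10]
    raised = False
except IndexError:
    raised = True
print(raised)  # True

# A slice that runs past the last row is truncated instead of raising
tail = df.iloc[1:10]
print(len(tail))  # 2
```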

1. pandas.DataFrame.iloc[] Syntax & Usage

DataFrame.iloc[] selects rows and/or columns in pandas by integer position. It accepts a single integer, a list of integers, a range (slice) of positions, and a boolean array, among other inputs.

One of the main advantages of a DataFrame is its ease of use. You can see this yourself when you use the loc[] or iloc[] attributes to select or filter DataFrame rows or columns. These are among the most commonly used attributes of a DataFrame.

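A range passed to iloc[] takes the slice form START:STOP:STEP, where

START is the integer position of the first row/column to select,
STOP is the integer position at which the selection stops (the row/column at STOP itself is not included), and
STEP is the number of positions to advance after each selection.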

Some points to note about iloc[].

By not providing a start index, iloc[] selects from the first row/column.
By not providing stop, iloc[] selects all rows/columns from the start index through the last one.
Providing both start and stop selects the rows/columns from start up to, but not including, stop.
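The points above can be sketched as follows. This is a small illustration on a two-column version of the article's data, and the variable names are chosen only for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

first_three = df.iloc[:3]   # no start: begins at the first row
from_third = df.iloc[2:]    # no stop: runs through the last row
middle = df.iloc[1:4]       # start and stop: positions 1, 2, 3 (stop excluded)
stepped = df.iloc[0:5:2]    # step of 2: positions 0, 2, 4

print(list(first_three.index))  # ['r1', 'r2', 'r3']
print(list(from_third.index))   # ['r3', 'r4', 'r5']
print(list(middle.index))       # ['r2', 'r3', 'r4']
print(list(stepped.index))      # ['r1', 'r3', 'r5']
```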

Let’s create a DataFrame and run some examples of pandas iloc.


# Create pandas DataFrame 
import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame:\n",df)

This creates a DataFrame with five rows labeled r1 through r5 and four columns: Courses, Fee, Duration, and Discount.

2. Select Single Row & Column By Index

Using iloc[] you can select a single row or a single column by position. The example below selects a row: iloc[1] returns the second row (label ‘r2’) of the DataFrame as a Series holding the value of each column in that row.


# Select single row by index
print(df.iloc[1])

# Outputs:
# Courses     PySpark
# Fee           25000
# Duration     40days
# Discount       2300
# Name: r2, dtype: object

You can also use iloc[] to select a single column by position. The example below selects all rows (:) of the column at position 0, which is the first column (Courses). The result is a Series in which each element is the Courses value for the respective row.


# Select single column by index
print(df.iloc[:, 0])

# Outputs:
# r1      Spark
# r2    PySpark
# r3     Hadoop
# r4     Python
# r5     pandas
# Name: Courses, dtype: object
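A single row and a single column position can also be combined to read one cell. The sketch below additionally shows that wrapping the position in a list keeps the result as a one-row DataFrame instead of a Series. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Row position 1, column position 0: a single scalar value
cell = df.iloc[1, 0]
print(cell)  # PySpark

# An integer returns a Series; a one-element list returns a DataFrame
row_as_series = df.iloc[1]
row_as_frame = df.iloc[[1]]
print(type(row_as_series).__name__)  # Series
print(row_as_frame.shape)            # (1, 4)
```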

3. Select Multiple Rows & Columns by Index

To select multiple rows or columns, pass a list of integer positions to iloc[]. The example below selects the rows at positions 1 and 2 and returns a DataFrame containing those rows.


# Select multiple rows by index
print(df.iloc[[1,2]])

# Outputs:
#    Courses    Fee Duration  Discount
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200

Similarly, you can select multiple columns from a pandas DataFrame. The example below selects the columns at positions 0, 1, and 3 (Courses, Fee, and Discount) for all rows.


# Select multiple columns by index
print(df.iloc[:, [0,1,3]])

# Outputs:
#    Courses    Fee  Discount
# r1    Spark  20000      1000
# r2  PySpark  25000      2300
# r3   Hadoop  26000      1200
# r4   Python  22000      2500
# r5   pandas  24000      2000
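Row and column lists can be passed together in one call. The following is a minimal sketch on the same data, with a variable name chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Rows at positions 1 and 2, columns at positions 0 and 3
subset = df.iloc[[1, 2], [0, 3]]
print(list(subset.index))    # ['r2', 'r3']
print(list(subset.columns))  # ['Courses', 'Discount']
```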

4. Select Rows or Columns by Index Range

By using iloc[], you can also select rows and columns by range, for example all items between two positions, or all items from a given position onward. The example below selects the rows from position 0 up to, but not including, position 4.


# Select rows between two indexes
# Includes Index 0 & Excludes 4
print(df.iloc[0:4])

# Outputs:
#    Courses    Fee Duration  Discount
# r1    Spark  20000    30day      1000
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500

You can select columns between two positions in the same way. The example below selects the columns from position 1 up to, but not including, position 4.


# Select Columns between two Indexes
# Includes Index 1 & Excludes 4
print(df.iloc[:,1:4])

# Outputs:
#      Fee Duration  Discount
# r1  20000    30day      1000
# r2  25000   40days      2300
# r3  26000   35days      1200
# r4  22000   40days      2500
# r5  24000   60days      2000
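Ranges work on both axes at once, and a range on one axis can be mixed with a list on the other. The sketch below illustrates both forms on the same data.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Row positions 1-2 and column positions 0-1 (both stops excluded)
block = df.iloc[1:3, 0:2]
print(list(block.index))    # ['r2', 'r3']
print(list(block.columns))  # ['Courses', 'Fee']

# A list of row positions combined with a column range
mixed = df.iloc[[0, 4], 1:3]
print(list(mixed.index))    # ['r1', 'r5']
print(list(mixed.columns))  # ['Fee', 'Duration']
```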

5. Select Alternate Rows or Columns

Similarly, by adding a step to the range you can select every alternate row of the DataFrame. df.iloc[0:4:2] selects rows starting from position 0 up to (but not including) position 4, with a step of 2.


# Select Alternate rows By Index
print(df.iloc[0:4:2])

# Outputs:
#   Courses    Fee Duration  Discount
# r1   Spark  20000    30day      1000
# r3  Hadoop  26000   35days      1200

To select alternate columns, use df.iloc[:, 1:4:2]. The slicing notation 1:4:2 selects columns starting from index 1 up to (but not including) index 4, with a step of 2.


# Select alternate columns between two indexes
print(df.iloc[:,1:4:2])

# Output:
#      Fee  Discount
# r1  20000      1000
# r2  25000      2300
# r3  26000      1200
# r4  22000      2500
# r5  24000      2000
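The step also works without explicit bounds, and it can be negative. This short sketch shows a few such variations, using standard Python slice semantics.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Duration": ["30day", "40days", "35days", "40days", "60days"],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

odd_rows = df.iloc[1::2]      # every second row, starting at position 1
reversed_rows = df.iloc[::-1] # all rows in reverse order
even_cols = df.iloc[:, ::2]   # every second column, starting at position 0

print(list(odd_rows.index))       # ['r2', 'r4']
print(list(reversed_rows.index))  # ['r5', 'r4', 'r3', 'r2', 'r1']
print(list(even_cols.columns))    # ['Courses', 'Duration']
```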

6. Using Conditions with iloc[]

By using iloc[] you can also select rows by condition. iloc[] does not accept a boolean Series directly, so df.iloc[list(df['Fee'] >= 24000)] first converts the result of the condition into a plain list of True/False values, one per row.

The example below selects the rows where the ‘Fee’ column is greater than or equal to 24000. The result is a DataFrame containing only the rows that satisfy the condition.


# By Condition
print(df.iloc[list(df['Fee'] >= 24000)])

# Output:
#    Courses    Fee Duration  Discount
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r5   pandas  24000   60days      2000
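As an alternative to list(), the boolean Series can be converted to a NumPy array with to_numpy(). The sketch below also compares the result with loc[], which accepts the boolean Series as is. This is one possible way to write the filter, not the only one.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

mask = df["Fee"] >= 24000          # boolean Series, one value per row

by_iloc = df.iloc[mask.to_numpy()] # iloc[] takes the boolean array
by_loc = df.loc[mask]              # loc[] takes the boolean Series directly

print(list(by_iloc.index))     # ['r2', 'r3', 'r5']
print(by_iloc.equals(by_loc))  # True
```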

7. Complete Example of pandas iloc[]


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Select Single Row by Index
print(df.iloc[1])

# Select Single Column by Index
print(df.iloc[:, 0])

# Select Multiple Rows by Index
print(df.iloc[[1,2]])

# Select Multiple Columns by Index
print(df.iloc[:, [0,1,3]])

# Includes Index 0 & Excludes 4
print(df.iloc[0:4])

# Includes Index 1 & Excludes 4
print(df.iloc[:,1:4])

# Select Alternate rows By Index
print(df.iloc[0:4:2])

# Select Alternate Columns between two Indexes
print(df.iloc[:,1:4:2])

# Select rows by condition
print(df.iloc[list(df['Fee'] >= 24000)])

Frequently Asked Questions on Pandas iloc[]

What is iloc[] in pandas?

In pandas, iloc[] is a property used for integer-location based indexing. It selects specific rows and columns of a DataFrame by their integer positions rather than by their labels.

Can I use negative indices with iloc[]?

You can use negative indices with iloc[] in pandas. Negative indices are interpreted as positions counting from the end of the DataFrame. For example, -1 refers to the last element, -2 refers to the second-to-last element, and so on.
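A brief sketch of negative positions on the article's data follows. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
        "Discount": [1000, 2300, 1200, 2500, 2000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

last_row = df.iloc[-1]     # last row, returned as a Series
last_two = df.iloc[-2:]    # last two rows
last_col = df.iloc[:, -1]  # last column

print(last_row.name)         # r5
print(list(last_two.index))  # ['r4', 'r5']
print(last_col.name)         # Discount
```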

How does iloc[] differ from loc[]?

While iloc[] uses integer-based indexing, loc[] uses label-based indexing. iloc[] is used when you want to select data based on numerical positions, whereas loc[] is used when you want to select data based on labels (row or column names).
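The difference can be illustrated side by side. Note in this sketch that a loc[] slice includes its end label, while an iloc[] slice excludes its stop position.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# The same cell, addressed by position and by label
by_position = df.iloc[1, 1]
by_label = df.loc["r2", "Fee"]
print(by_position == by_label)  # True

# iloc[] excludes the stop position; loc[] includes the end label
pos_slice = df.iloc[0:2]
label_slice = df.loc["r1":"r3"]
print(list(pos_slice.index))    # ['r1', 'r2']
print(list(label_slice.index))  # ['r1', 'r2', 'r3']
```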

Can I select a range of rows or columns using iloc[]?

You can select a range of rows or columns using iloc[] in pandas by using slicing. Slicing allows you to specify a range of indices or positions.

How do I select alternate rows or columns with iloc[]?

To select alternate rows or columns using iloc[] in pandas, you can use slicing with a step parameter. The step parameter allows you to specify the interval between selected elements. For example, df.iloc[::2] selects every second row and df.iloc[:, ::2] selects every second column.

Is it possible to select rows based on a condition using iloc[]?

iloc[] is primarily designed for integer-location based indexing and does not accept a boolean Series directly. It does accept a boolean list or array, so you can convert the result of a condition, for example with list(), and pass it to iloc[] to select the matching rows.

Can I use iloc[] with both row labels and column labels simultaneously?

You cannot use iloc[] with both row labels and column labels simultaneously. iloc[] is specifically designed for integer-location based indexing, and it expects integer indices for both rows and columns.
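If you know a label but need a position, one possible workaround is to translate the label with Index.get_loc() before calling iloc[]. The sketch below assumes the row and column labels are unique, in which case get_loc() returns a single integer.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Courses": ["Spark", "PySpark", "Hadoop", "Python", "pandas"],
        "Fee": [20000, 25000, 26000, 22000, 24000],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Translate labels to integer positions (labels assumed unique)
fee_pos = df.columns.get_loc("Fee")
row_pos = df.index.get_loc("r4")
print(fee_pos, row_pos)  # 1 3

# Use the positions with iloc[]
value = df.iloc[row_pos, fee_pos]
print(value)  # 22000
```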

Conclusion

In this article, you have learned that iloc[] in pandas selects rows and/or columns by integer position. It accepts a single index, a list of indexes, a range of indexes, and more.

Happy Learning !!
