Pandas iloc[] Usage with Examples

  • Post author:
  • Post category:Pandas
  • Post last modified:November 29, 2023

pandas.DataFrame.iloc[] is a property that is used to select rows and columns by position/index. If the position/index does not exist, it gives an index error. In this article, I will cover usage and examples of pandas iloc.

Pandas DataFrame is a two-dimensional tabular data structure with labeled axes. i.e. columns and rows. Selecting columns from DataFrame results in a new DataFrame containing only specified selected columns from the original DataFrame.

pandas loc[] is another property that is used to operate on the column and row labels. For a better understanding of these two learn the differences and similarities between pandas loc[] vs iloc[]. The difference between loc[] vs iloc[] is described by how you select rows and columns from pandas DataFrame.

1 pandas.DataFrame.iloc[] Syntax & Usage

DataFrame.iloc[] is index-based to select rows and/or columns in pandas. It accepts a single index, multiple indexes from the list, indexes by a range, and many more.

One of the main advantages of DataFrame is its ease of use. You can see this yourself when you use loc[] or iloc[] attributes to select or filter DataFrame rows or columns. These are mostly used attributes in DataFrame.

pandas iloc
  • START is the integer index of the row/column.
  • STOP is the integer index of the last row/column where you wanted to stop the selection, and 
  • STEP as the number of indices to advance after each extraction.

Some point to note about iloc[].

  • By not providing a start index, iloc[] selects from the first row/column.
  • By not providing stop, iloc[] selects all rows/columns from the start index.
  • Providing both start and stop, selects all rows/columns in between.

Let’s create a DataFrame and run some examples of pandas iloc.


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Outputs:
# r1    Spark  20000    30day      1000
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500
# r5   pandas  24000   60days      2000

2. Select Single Row & Column By Index

Using iloc[] you can select the single row and column by index. The below example demonstrates how to select row by index.


# Select Single Row by Index
print(df.iloc[1])

# Outputs:
# Courses     PySpark
# Fee           25000
# Duration     40days
# Discount       2300
# Name: r2, dtype: object

In order to select a column by Index, use the below.


# Select Single Column by Index
print(df.iloc[:, 0])

# Outputs:
#    Courses
# r1    Spark
# r2  PySpark
# r3   Hadoop
# r4   Python
# r5   pandas

3. Select Multiple Rows & Columns by Index

To select multiple rows and columns, use the integer index as a list to iloc[] attribute. Below is an example of how to select rows by index.


# Select Multiple Rows by Index
print(df.iloc[[1,2]])

# Outputs:
#    Courses    Fee Duration  Discount
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200

Similarly to select multiple columns from pandas DataFrame.


# Select Multiple Columns by Index
print(df.iloc[:, [0,1,3]])

# Outputs:
#    Courses    Fee  Discount
# r1    Spark  20000      1000
# r2  PySpark  25000      2300
# r3   Hadoop  26000      1200
# r4   Python  22000      2500
# r5   pandas  24000      2000

4. Select Rows or Columns by Index Range

By using iloc[], you can also select rows and columns by range. For example all items between two rows/columns. all items starting from e.t.c. The below example selects rows in between 0 and 4 row indices.


# Select Rows Between two Indexs
# Includes Index 0 & Execludes 4
print(df.iloc[0:4])

# Outputs:
#    Courses    Fee Duration  Discount
# r1    Spark  20000    30day      1000
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r4   Python  22000   40days      2500

To select columns between two column names. The below example selects all columns between 1 and 4 column indexes.


# Select Columns between two Indexes
# Includes Index 1 & Execludes 4
print(df.iloc[:,1:4])

# Outputs:
#      Fee Duration  Discount
# r1  20000    30day      1000
# r2  25000   40days      2300
# r3  26000   35days      1200
# r4  22000   40days      2500
# r5  24000   60days      2000

5. Select Alternate Rows or Columns

Similarly, by using ranges you can also select every alternate row from DataFrame.


# Select Alternate rows By Index
print(df.iloc[0:4:2])

# Outputs:
#   Courses    Fee Duration  Discount
# r1   Spark  20000    30day      1000
# r3  Hadoop  26000   35days      1200

To select alternate columns use


# Select Alternate Columns between two Indexes
print(df.iloc[:,1:4:2])

# Output:
#      Fee  Discount
# r1  20000      1000
# r2  25000      2300
# r3  26000      1200
# r4  22000      2500
# r5  24000      2000

6. Using Conditions with iloc[]

By using iloc[] you can also select rows by conditions from pandas DataFrame.


# By Condition
print(df.iloc[list(df['Fee'] >= 24000)])

# Output:
#    Courses    Fee Duration  Discount
# r2  PySpark  25000   40days      2300
# r3   Hadoop  26000   35days      1200
# r5   pandas  24000   60days      2000

7. pandas iloc[] Complete Example


import pandas as pd
technologies = {
    'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
    'Fee' :[20000,25000,26000,22000,24000],
    'Duration':['30day','40days','35days','40days','60days'],
    'Discount':[1000,2300,1200,2500,2000]
              }
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(technologies,index=index_labels)
print(df)

# Select Single Row by Index
print(df.iloc[1])

# Select Single Column by Index
print(df.iloc[:, 0])

# Select Multiple Rows by Index
print(df.iloc[[1,2]])

# Select Multiple Columns by Index
print(df.iloc[:, [0,1,3]])

# Includes Index 0 & Execludes 4
print(df.iloc[0:4])

# Includes Index 1 & Execludes 4
print(df.iloc[:,1:4])

# Select Alternate rows By Index
print(df.iloc[0:4:2])

# Select Alternate Columns between two Indexes
print(df.iloc[:,1:4:2])

print(df.iloc[list(df['Fee'] >= 24000)])

Conclusion

In this article, you have learned iloc in pandas is index-based to select rows and/or columns. It accepts a single index, multiple indexes from the list, indexes by a range, and many more.

Happy Learning !!

Related Articles

References

Naveen

I am a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, I have honed my expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. My journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. I have started this SparkByExamples.com to share my experiences with the data as I come across. You can learn more about me at LinkedIn

Leave a Reply

You are currently viewing Pandas iloc[] Usage with Examples