Like any other data structure, a pandas DataFrame can be iterated row by row so that you can access the columns/elements of each row. DataFrame provides the methods iterrows() and itertuples() to iterate over rows.
1. Using DataFrame.iterrows() to Iterate Over Rows
pandas DataFrame.iterrows() is used to iterate over DataFrame rows. It yields (index, Series) pairs, where index is the index label of the row and Series holds the data of that row. To get a value from the Series, use the column name, for example row["Fee"]. To learn more about Series, see How to use Series with Examples.
First, let’s create a DataFrame.
import pandas as pd
technologies = {
    'Courses': ["Spark","PySpark","Hadoop","Python","Pandas","Oracle","Java"],
    'Fee': [20000,25000,26000,22000,24000,21000,22000],
    'Duration': ['30day','40days','35days','40days','60days','50days','55days']
}
df = pd.DataFrame(technologies)
print(df)
Yields the result below. As you can see, the DataFrame has 3 columns: Courses, Fee and Duration.
# Output:
Courses Fee Duration
0 Spark 20000 30day
1 PySpark 25000 40days
2 Hadoop 26000 35days
3 Python 22000 40days
4 Pandas 24000 60days
5 Oracle 21000 50days
6 Java 22000 55days
The example below iterates over all rows in the DataFrame using iterrows().
# Iterate all rows using DataFrame.iterrows()
for index, row in df.iterrows():
    print(index, row["Fee"], row["Courses"])
Yields below output.
# Output:
0 20000 Spark
1 25000 PySpark
2 26000 Hadoop
3 22000 Python
4 24000 Pandas
5 21000 Oracle
6 22000 Java
Let’s see what a row looks like by printing it.
# Row contains the column name and data
row = next(df.iterrows())[1]
print("Data For First Row :")
print(row)
Yields below output.
# Output:
Data For First Row :
Courses Spark
Fee 20000
Duration 30day
Name: 0, dtype: object
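Note that because iterrows() returns each row as a single Series, it does not preserve the column data types across the row. In the output above the row has dtype object because the columns hold mixed types. To check the type of a single value, use type(row["Fee"]). If you need to preserve the data types while iterating, use DataFrame.itertuples().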
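As a minimal sketch of this dtype behavior, the snippet below uses a small hypothetical DataFrame (the qty and price columns are made up for illustration and are not part of this article's data) with one integer and one float column. Because iterrows() packs each row into a single Series, the integer value should be upcast to float, while itertuples() keeps it as an integer.

```python
import pandas as pd

# Hypothetical two-column frame, used only for this illustration
df_types = pd.DataFrame({"qty": [1, 2], "price": [1.5, 2.5]})

# iterrows(): the whole row is one Series, so qty is upcast to float
_, row = next(df_types.iterrows())
print(row.dtype, type(row["qty"]))

# itertuples(): each value keeps the type of its own column
tup = next(df_types.itertuples())
print(type(tup.qty), type(tup.price))
```

If the exact types matter for your logic, checking them on a small sample like this before writing the full loop is a cheap safeguard.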
Note: The pandas documentation states: “You should never modify something you are iterating over. This is not guaranteed to work in all cases. Depending on the data types, the iterator returns a copy and not a view, and writing to it will have no effect.”
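To make the note above concrete, here is a small sketch that uses a shortened, three-row version of the example data (an assumption made to keep the snippet self-contained). For this mixed-type DataFrame the row yielded by iterrows() is a copy, so writing to it leaves the DataFrame unchanged. Assigning to the column instead updates the DataFrame itself.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration only
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
})

# Writing to the row only changes a copy; df is left untouched
for index, row in df.iterrows():
    row["Fee"] = 0

fees_after_loop = df["Fee"].tolist()
print(fees_after_loop)

# Column-level assignment updates the DataFrame itself
df["Fee"] = df["Fee"] + 1000
print(df["Fee"].tolist())
```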
2. Using DataFrame.itertuples() to Iterate Over Rows
Pandas DataFrame.itertuples() is a commonly used method to iterate over rows. It returns an iterator that yields a named tuple for each row of the DataFrame. itertuples() is generally faster than iterrows() and preserves the data types of the values.
Below is the syntax of itertuples().
# Syntax DataFrame.itertuples()
DataFrame.itertuples(index=True, name='Pandas')
index – Defaults to True. Returns the DataFrame index as the first element of the tuple. Setting it to False omits the index.
name – Defaults to 'Pandas'. You can provide a custom name for the returned named tuple.
The example below loops through all rows and gets the value of each column from the named tuple by using getattr().
# Iterate all rows using DataFrame.itertuples()
for row in df.itertuples(index=True):
    print(getattr(row, 'Index'), getattr(row, "Fee"), getattr(row, "Courses"))
Yields below output.
# Output:
0 20000 Spark
1 25000 PySpark
2 26000 Hadoop
3 22000 Python
4 24000 Pandas
5 21000 Oracle
6 22000 Java
Let’s provide the custom name to the tuple.
# Display one row from iterator
row = next(df.itertuples(index = True,name='Tution'))
print(row)
Yields below output.
# Output:
Tution(Index=0, Courses='Spark', Fee=20000, Duration='30day')
If you set the index parameter to False, the index is not included as the first element of the tuple.
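The sketch below shows the index and name parameters in use on a shortened, three-row version of the example data (an assumption made to keep it self-contained). It also uses plain attribute access (row.Fee) as an alternative to getattr(), and passes name=None, which makes itertuples() return regular tuples instead of named tuples.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration only
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Duration": ["30day", "40days", "35days"],
})

# index=False: the tuple holds only the column values
first = next(df.itertuples(index=False))
print(first)

# Attribute access works for column names that are valid identifiers
fees = [row.Fee for row in df.itertuples(index=False)]
print(fees)

# name=None: plain tuples without field names
plain_rows = list(df.itertuples(index=False, name=None))
print(plain_rows[0])
```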
4. DataFrame.apply() to Iterate
You can also use the apply() method of the DataFrame with axis=1 to call a function, such as a lambda, on each row. For more details, refer to DataFrame.apply().
# Syntax of DataFrame.apply()
DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwargs)
Example:
# Another alternate approach by using DataFrame.apply()
print(df.apply(lambda row: str(row["Fee"]) + " " + str(row["Courses"]), axis = 1))
Yields below output.
# Output:
0 20000 Spark
1 25000 PySpark
2 26000 Hadoop
3 22000 Python
4 24000 Pandas
5 21000 Oracle
6 22000 Java
dtype: object
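When the per-row logic grows beyond a one-liner, a named function can be easier to read than a lambda. The following is a minimal sketch on a shortened version of the example data; the describe() helper and the Summary column are hypothetical names introduced only for this illustration.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration only
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
})

def describe(row):
    # row is a Series holding the values of one DataFrame row
    return f"{row['Courses']} costs {row['Fee']}"

# axis=1 passes each row (not each column) to the function
df["Summary"] = df.apply(describe, axis=1)
print(df["Summary"].tolist())
```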
5. Iterating using for & DataFrame.index
You can also loop through the rows with a plain for loop over DataFrame.index. For example, df['Fee'][0] returns the value of the Fee column for the row with index label 0, which is the first row here.
# Using DataFrame.index
for idx in df.index:
    print(df['Fee'][idx], df['Courses'][idx])
Yields below output.
# Output:
20000 Spark
25000 PySpark
26000 Hadoop
22000 Python
24000 Pandas
21000 Oracle
22000 Java
6. Using for & DataFrame.loc
# Another alternate approach by using DataFrame.loc
# loc is label-based, so range(len(df)) assumes the default integer index (0, 1, 2, ...)
for i in range(len(df)):
    print(df.loc[i, "Fee"], df.loc[i, "Courses"])
Yields the same output as above.
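Because loc looks rows up by label, the range(len(df)) pattern above only works when the DataFrame has the default integer index. The sketch below uses a hypothetical DataFrame with string labels (a, b, c, chosen only for illustration) to show that a positional lookup fails there, while iterating over df.index works regardless of the index type.

```python
import pandas as pd

# Hypothetical frame with a non-default, string-labelled index
df_labeled = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop"], "Fee": [20000, 25000, 26000]},
    index=["a", "b", "c"],
)

# Iterating over the actual labels is safe for any index
for label in df_labeled.index:
    print(label, df_labeled.loc[label, "Fee"], df_labeled.loc[label, "Courses"])

# A positional integer is not a valid label here, so loc raises KeyError
try:
    df_labeled.loc[0, "Fee"]
    positional_lookup_failed = False
except KeyError:
    positional_lookup_failed = True
print(positional_lookup_failed)
```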
7. Using for & DataFrame.iloc
# Another alternate approach by using DataFrame.iloc
# iloc is position-based: column 0 is Courses and column 2 is Duration
for i in range(len(df)):
    print(df.iloc[i, 0], df.iloc[i, 2])
Yields below output.
# Output:
Spark 30day
PySpark 40days
Hadoop 35days
Python 40days
Pandas 60days
Oracle 50days
Java 55days
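Hard-coded column positions such as 0 and 2 break silently if the column order changes. One possible way around this, sketched below on a shortened version of the example data, is to look the positions up once with df.columns.get_loc() and then use them with iloc. The variable names are assumptions for this illustration.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration only
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Duration": ["30day", "40days", "35days"],
})

# Resolve column positions by name instead of hard-coding them
courses_pos = df.columns.get_loc("Courses")
duration_pos = df.columns.get_loc("Duration")

pairs = []
for i in range(len(df)):
    pairs.append((df.iloc[i, courses_pos], df.iloc[i, duration_pos]))
print(pairs)
```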
8. Using DataFrame.items() to Iterate Over Columns
DataFrame.items() is used to iterate over a pandas DataFrame column by column. It yields (column name, Series) tuples: the first value is the column label and the second is the content of that column as a Series.
# Iterate over column by column using DataFrame.items()
for label, content in df.items():
    print(f'label: {label}')
    print(f'content: {content}')
Yields below output.
# Output:
label: Courses
content: 0 Spark
1 PySpark
2 Hadoop
3 Python
4 Pandas
5 Oracle
6 Java
Name: Courses, dtype: object
label: Fee
content: 0 20000
1 25000
2 26000
3 22000
4 24000
5 21000
6 22000
Name: Fee, dtype: int64
label: Duration
content: 0 30day
1 40days
2 35days
3 40days
4 60days
5 50days
6 55days
Name: Duration, dtype: object
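Since each item is a (label, Series) pair, items() fits naturally into a dict comprehension when you want one summary value per column. As a small sketch on the same data (the unique_counts name is hypothetical), the snippet below counts the distinct values in each column.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Python", "Pandas", "Oracle", "Java"],
    "Fee": [20000, 25000, 26000, 22000, 24000, 21000, 22000],
    "Duration": ["30day", "40days", "35days", "40days", "60days", "50days", "55days"],
})

# One entry per column: label -> number of distinct values
unique_counts = {label: content.nunique() for label, content in df.items()}
print(unique_counts)
```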
9. Performance of Iterating DataFrame
Iterating over a DataFrame row by row is not recommended, because performance degrades badly on large datasets. Use it only after you have exhausted all other options. Before using the examples in this article, check whether you can use any of these instead: 1) vectorization, 2) Cython routines, 3) list comprehensions (vanilla for loop).
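As a rough sketch of the first and third alternatives, the snippet below computes new columns without an explicit row loop, using a shortened version of the example data. The FeeAfterDiscount and Label column names and the flat discount of 1000 are assumptions made up for this illustration.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration only
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
})

# 1) Vectorization: operate on the whole column at once
df["FeeAfterDiscount"] = df["Fee"] - 1000

# 3) List comprehension over the zipped columns
df["Label"] = [f"{course}: {fee}" for course, fee in zip(df["Courses"], df["Fee"])]

print(df)
```

Vectorized column operations are usually the first thing to try. A list comprehension is a reasonable fallback when the per-row logic cannot be expressed as a column operation.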

10. Complete Example of pandas Iterate over Rows
import pandas as pd
technologies = {
    'Courses': ["Spark","PySpark","Hadoop","Python","Pandas","Oracle","Java"],
    'Fee': [20000,25000,26000,22000,24000,21000,22000],
    'Duration': ['30day','40days','35days','40days','60days','50days','55days']
}
df = pd.DataFrame(technologies)
print(df)

# Using DataFrame.iterrows()
row = next(df.iterrows())[1]
print("Data For First Row :")
print(row)
for index, row in df.iterrows():
    print(index, row["Fee"], row["Courses"])

# Using DataFrame.itertuples()
row = next(df.itertuples(index=True, name='Tution'))
print("Data For First Row :")
print(row)
for row in df.itertuples(index=True):
    print(getattr(row, 'Index'), getattr(row, "Fee"), getattr(row, "Courses"))

# Another alternate approach by using DataFrame.apply
print(df.apply(lambda row: str(row["Fee"]) + " " + str(row["Courses"]), axis=1))

# Using DataFrame.index
for idx in df.index:
    print(df['Fee'][idx], df['Courses'][idx])

# Another alternate approach by using DataFrame.loc
for i in range(len(df)):
    print(df.loc[i, "Fee"], df.loc[i, "Courses"])

# Another alternate approach by using DataFrame.iloc
for i in range(len(df)):
    print(df.iloc[i, 0], df.iloc[i, 2])

# Using DataFrame.items
for label, content in df.items():
    print(f'label: {label}')
    print(f'content: {content}')
Conclusion
DataFrame provides several methods to iterate over rows and access columns/cells. However, manually looping over rows is not recommended because it degrades performance on large datasets. Each approach explained in this article behaves differently, so choose the one that suits your use case.
Happy Learning !!