Like other data structures, a pandas DataFrame can be iterated (looped) over column by column so that you can access the elements of each column. You can use a for loop to iterate over the columns of a DataFrame.
There are several ways to do this, including items() (named iteritems() before pandas 2.0), the [] (getitem) syntax, transpose().iterrows(), enumerate(), and numpy.asarray(). In this article, I will explain the usage of these methods with examples.
1. Quick Examples of Iterate Over Columns in Pandas DataFrame
If you are in a hurry, below are some quick examples of how to iterate over columns of pandas DataFrame.
# Below are the quick examples.
# They assume an existing DataFrame df and: import numpy as np

# Example 1: Use getitem ([]) to iterate over columns
for column in df:
    print(df[column])

# Example 2: Use getitem ([]) with .values to print each column as an array
for column in df:
    print(df[column].values)

# Example 3: Iterate over columns using DataFrame.items()
# (iteritems() was removed in pandas 2.0)
for (colname, colval) in df.items():
    print(colname, colval.values)

# Example 4: Use items() to print the value at index label 1 of each column
for name, values in df.items():
    print('{name}: {value}'.format(name=name, value=values[1]))

# Example 5: Iterate over columns in pandas DataFrame using enumerate()
for (index, colname) in enumerate(df):
    print(index, df[colname].values)

# Example 6: Using enumerate()
for (index, column) in enumerate(df):
    print(index, df[column])

# Example 7: Using enumerate() & numpy.asarray()
for (index, column) in enumerate(df):
    print(index, np.asarray(df[column]))

# Example 8: Use DataFrame.columns to skip the first column
for column in df.columns[1:]:
    print(df[column])

# Example 9: Iterate over all the columns in reversed order
for column in df.columns[::-1]:
    print(df[column])

# Example 10: Get the indices of all columns
for index, column in enumerate(df.columns):
    print(index, column)

# Example 11: Use DataFrame.transpose().iterrows()
for (column_name, column) in df.transpose().iterrows():
    print(column_name)
Now, let's create a DataFrame with a few rows and columns, run these examples, and validate the results. Our DataFrame contains the columns Courses, Fee, Duration, and Discount.
import numpy as np   # needed later for np.asarray()
import pandas as pd

technologies = [
    ("Spark", 22000, '30days', 1000.0),
    ("PySpark", 25000, '50days', 2300.0),
    ("Hadoop", 23000, '55days', 1500.0)
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])
print("Create DataFrame:\n", df)
2. Iterate Over DataFrame Columns
One simple way to iterate over the columns of a pandas DataFrame is a for loop. Iterating over a DataFrame directly yields its column labels, and you can pass each label to the [] (getitem) syntax to retrieve that column as a Series.
# Use getitem ([]) to iterate over columns
for column in df:
    print(df[column])
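Each iteration prints one column as a Series.
To get the elements of each column as a NumPy array instead, use the values attribute. Note that values is a property, not a method, so it takes no parentheses.
# Use getitem ([]) with .values to iterate over columns in pandas DataFrame
for column in df:
    print(df[column].values)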
Yields below output.
# Output:
['Spark' 'PySpark' 'Hadoop']
[22000 25000 23000]
['30days' '50days' '55days']
[1000. 2300. 1500.]
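As a small sketch of why looping over the DataFrame itself works: the loop yields column labels, the same labels held in df.columns, and each label can then be used with []. The names labels_from_loop and first_values are illustrative only and do not appear in the examples above.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Iterating the DataFrame itself yields column labels, not rows or values
labels_from_loop = [column for column in df]
print(labels_from_loop)

# Each label retrieves one column as a Series via the [] syntax;
# here we keep only the first element of every column
first_values = {column: df[column].iloc[0] for column in df}
print(first_values)
```

Because the loop only hands back labels, any per-column work (printing, summarizing, converting) happens through df[column] inside the loop body.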
3. Iterate Over Columns Using DataFrame.items()
Pandas also provides methods for iterating over DataFrame columns. For example, df.items() iterates over the DataFrame and yields each column as a (column name, Series) pair, which you can unpack with for (colname, colval) in df.items():. Older tutorials use df.iteritems() for the same purpose; that method was deprecated in pandas 1.5 and removed in pandas 2.0, so use items() in current code.
# Iterate over columns using DataFrame.items()
for (colname, colval) in df.items():
    print(colname, colval.values)
Yields below output.
# Output:
Courses ['Spark' 'PySpark' 'Hadoop']
Fee [22000 25000 23000]
Duration ['30days' '50days' '55days']
Discount [1000. 2300. 1500.]
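You can also index into each column Series inside the loop, for example to print the value at index label 1 (the second row) of every column.
# Use items() and pick one element per column
for name, values in df.items():
    print('{name}: {value}'.format(name=name, value=values[1]))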
Yields below output.
# Output:
Courses: PySpark
Fee: 25000
Duration: 50days
Discount: 2300.0
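The (name, Series) pairs are convenient when you want to compute something per column. The following is a minimal sketch, assuming a reduced two-column version of the sample data; column_totals is an illustrative name, not part of the original examples.

```python
import pandas as pd

df = pd.DataFrame(
    {"Fee": [22000, 25000, 23000], "Discount": [1000.0, 2300.0, 1500.0]}
)

# items() yields (column label, Series) pairs, one pair per column
column_totals = {}
for name, series in df.items():
    column_totals[name] = float(series.sum())

print(column_totals)
```

For simple aggregations like this one, df.sum() gives the same numbers without an explicit loop; the loop form is mainly useful when each column needs different handling.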
4. Iterate Over Columns in DataFrame Using enumerate()
You can also pass the DataFrame to enumerate() to get a positional index alongside each column name, and use both in a for loop to iterate over each column.
# Iterate over columns in pandas DataFrame using enumerate()
for (index, colname) in enumerate(df):
    print(index, df[colname].values)
Yields below output.
# Output:
0 ['Spark' 'PySpark' 'Hadoop']
1 [22000 25000 23000]
2 ['30days' '50days' '55days']
3 [1000. 2300. 1500.]
The same pattern works if you want to print the full Series of each column together with its positional index.
# Using enumerate()
for (index, column) in enumerate(df):
    print(index, df[column])
Yields below output.
# Output:
0 0 Spark
1 PySpark
2 Hadoop
Name: Courses, dtype: object
1 0 22000
1 25000
2 23000
Name: Fee, dtype: int64
2 0 30days
1 50days
2 55days
Name: Duration, dtype: object
3 0 1000.0
1 2300.0
2 1500.0
Name: Discount, dtype: float64
In the code above, df[column] is a Series, which can be converted to a NumPy ndarray, for example with np.asarray(df[column]). This requires import numpy as np.
# Using enumerate() & numpy.asarray()
for (index, column) in enumerate(df):
    print(index, np.asarray(df[column]))
Yields below output.
# Output:
0 ['Spark' 'PySpark' 'Hadoop']
1 [22000 25000 23000]
2 ['30days' '50days' '55days']
3 [1000. 2300. 1500.]
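To make the conversion step concrete, here is a hedged sketch that stores the converted arrays and compares np.asarray() with Series.to_numpy(), another pandas method for the same conversion. The two-column frame and the arrays dictionary are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

arrays = {}
for index, column in enumerate(df):
    # np.asarray() turns the column Series into a NumPy ndarray
    arrays[column] = np.asarray(df[column])
    # Series.to_numpy() is the pandas-side way to get the same data
    same = np.array_equal(arrays[column], df[column].to_numpy())
    print(index, column, type(arrays[column]).__name__, same)
```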
5. Use DataFrame.columns to Iterate Over Selected Columns
DataFrame.columns is an attribute (not a method, so it takes no parentheses) that returns an Index holding all the column labels of the DataFrame. An Index supports Python-style slicing, so you can slice df.columns to select the columns you need. For instance, to iterate over all columns but the first one, use for column in df.columns[1:]:.
# Use DataFrame.columns to skip the first column
for column in df.columns[1:]:
    print(df[column])
Similarly, to iterate over all the columns in reversed order, use for column in df.columns[::-1]:.
# Iterate over all the columns in reversed order
for column in df.columns[::-1]:
    print(df[column])
You can also get the positional index of every column with for index, column in enumerate(df.columns):.
# Get the indices of all columns
for index, column in enumerate(df.columns):
    print(index, column)
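The slicing shown in this section can be checked without any row data. This is a small sketch on an empty DataFrame that has only the column labels of the sample data; all_but_first and reversed_order are illustrative names.

```python
import pandas as pd

# An empty frame is enough here, since only the labels matter
df = pd.DataFrame(columns=["Courses", "Fee", "Duration", "Discount"])

# columns is an attribute holding a pandas Index, so no parentheses are used
print(type(df.columns).__name__)

# Slicing an Index follows the same rules as slicing a list
all_but_first = list(df.columns[1:])
reversed_order = list(df.columns[::-1])
print(all_but_first)
print(reversed_order)
```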
6. Use DataFrame.transpose().iterrows()
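Another option is to transpose the DataFrame with DataFrame.transpose(), which turns its columns into rows, and then iterate over those rows with iterrows(). Each iteration yields the original column name and a Series of that column's values. Transposing copies the data, and with mixed column types the resulting rows have object dtype, so prefer the earlier methods when you need the original dtypes.
# Use DataFrame.transpose().iterrows()
for (column_name, column) in df.transpose().iterrows():
    print(column_name)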
Yields below output.
# Output:
Courses
Fee
Duration
Discount
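One side effect of the transpose approach is worth seeing in code. The sketch below, which assumes a small frame with one text column and one integer column, records the dtype of each Series produced by iterrows(); transposed_dtypes is an illustrative name.

```python
import pandas as pd

df = pd.DataFrame({"Courses": ["Spark", "PySpark"], "Fee": [22000, 25000]})

# In the original frame the Fee column keeps an integer dtype
print(df["Fee"].dtype)

# After transposing a mixed-type frame, each former column comes back
# from iterrows() as a Series with object dtype
transposed_dtypes = {}
for column_name, row in df.transpose().iterrows():
    transposed_dtypes[column_name] = row.dtype
print(transposed_dtypes)
```

If the loop body depends on numeric dtypes, iterating with df.items() or df[column] avoids this conversion.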
7. Complete Example of Iterating Over Columns in a DataFrame
import numpy as np
import pandas as pd

technologies = [
    ("Spark", 22000, '30days', 1000.0),
    ("PySpark", 25000, '50days', 2300.0),
    ("Hadoop", 23000, '55days', 1500.0)
]
df = pd.DataFrame(technologies, columns=['Courses', 'Fee', 'Duration', 'Discount'])
print(df)

# Use getitem ([]) to iterate over columns
for column in df:
    print(df[column])

# Use getitem ([]) with .values to print each column as an array
for column in df:
    print(df[column].values)

# Iterate over columns using DataFrame.items()
# (iteritems() was removed in pandas 2.0)
for (colname, colval) in df.items():
    print(colname, colval.values)

# Use items() to print the value at index label 1 of each column
for name, values in df.items():
    print('{name}: {value}'.format(name=name, value=values[1]))

# Iterate over columns in pandas DataFrame using enumerate()
for (index, colname) in enumerate(df):
    print(index, df[colname].values)

# Using enumerate()
for (index, column) in enumerate(df):
    print(index, df[column])

# Using enumerate() & numpy.asarray()
for (index, column) in enumerate(df):
    print(index, np.asarray(df[column]))

# Use DataFrame.columns to skip the first column
for column in df.columns[1:]:
    print(df[column])

# Iterate over all the columns in reversed order
for column in df.columns[::-1]:
    print(df[column])

# Get the indices of all columns
for index, column in enumerate(df.columns):
    print(index, column)

# Use DataFrame.transpose().iterrows()
for (column_name, column) in df.transpose().iterrows():
    print(column_name)
Conclusion
In this article, you have learned how to iterate over the columns of a pandas DataFrame using DataFrame.items() (formerly iteritems()), the [] (getitem) syntax, enumerate(), DataFrame.columns, numpy.asarray(), and DataFrame.transpose().iterrows(), with examples.
Happy Learning!!