  • Post last modified:March 27, 2024
Pandas Iterate Over Columns of DataFrame

Like most data structures, a pandas DataFrame can be iterated (looped) over column by column, which gives you access to the elements of each column. You can do this with a plain for loop.

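You can iterate over the columns of a pandas DataFrame in several ways: DataFrame.items() (named iteritems() in older releases; iteritems() was deprecated in pandas 1.5 and removed in pandas 2.0), the [] (getitem) operator, transpose().iterrows(), enumerate(), and the numpy.asarray() function. In this article, I will explain the usage of these methods with examples.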

1. Quick Examples of Iterating Over Columns in Pandas DataFrame

If you are in a hurry, below are some quick examples of how to iterate over columns of pandas DataFrame.


# Below are the quick examples.

# Example 1: Use getitem ([]) to iterate over columns
for column in df:
    print(df[column])
    
# Example 2: Use getitem ([]) to iterate over columns in pandas DataFrame
for column in df:
    print(df[column].values)
    
# Example 3: Iterate over columns using DataFrame.items()
# (iteritems() was removed in pandas 2.0; items() replaces it)
for (colname,colval) in df.items():
    print(colname, colval.values)
    
# Example 4: Use items() to get the second value of each column
for name, values in df.items():
    print('{name}: {value}'.format(name=name, value=values[1]))
   
# Example 5: Iterate over columns in pandas DataFrame using enumerate()
for (index, colname) in enumerate(df):
    print(index, df[colname].values)
    
# Example 6: Using enumerate()
for (index, column) in enumerate(df):
    print (index, df[column])
    
# Example 7: Using enumerate() & numpy.asarray() (requires: import numpy as np)
for (index, column) in enumerate(df):
    print (index, np.asarray(df[column]))
    
# Example 8: Use DataFrame.columns to iterate over selected columns
for column in df.columns[1:]:
    print(df[column])
    
# Example 9: Iterate over all the columns in reversed order    
for column in df.columns[::-1]:
    print(df[column])
    
# Example 10: Get the indices of all columns
for index, column in enumerate(df.columns):
    print(index, column)
    
# Example 11: Use DataFrame.transpose().iterrows()
for (column_name, column) in df.transpose().iterrows():
    print (column_name)

Now, let’s create a DataFrame with a few rows and columns, execute these examples and validate the results. Our DataFrame contains column names Courses, Fee, Duration, and Discount.


import pandas as pd
technologies = [
            ("Spark", 22000,'30days',1000.0),
            ("PySpark",25000,'50days',2300.0),
            ("Hadoop",23000,'55days',1500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print("Create DataFrame:\n", df)

This prints the DataFrame, which has three rows and the four columns Courses, Fee, Duration, and Discount.

2. Iterate Over DataFrame Columns

The simplest way to iterate over the columns of a pandas DataFrame is a for loop. Looping over the DataFrame itself yields its column labels, and you can pass each label to the getitem syntax ([]) to get that column as a Series.


# Use getitem ([]) to iterate over columns
for column in df:
    print(df[column])

This prints each column as a Series, followed by its name and dtype.

The values attribute (it is an attribute, not a method) returns the data of each column as an array instead of a Series, as the output below shows.


# Use getitem ([]) to iterate over columns in pandas DataFrame
for column in df:
    print(df[column].values)

Yields below output.


# Output:
['Spark' 'PySpark' 'Hadoop']
[22000 25000 23000]
['30days' '50days' '55days']
[1000. 2300. 1500.]
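
As a minimal sketch (not one of the original examples), the loop below checks what kind of object you get back from each column. It assumes the same sample data as above and uses Series.to_numpy(), which the pandas documentation recommends over .values when you need a NumPy array, and Series.tolist() for the cases where you want a real Python list. The array_types and column_lists names are only illustrative.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

array_types = {}   # column label -> type name of the array we got
column_lists = {}  # column label -> plain Python list of the values

for column in df:
    arr = df[column].to_numpy()                 # NumPy ndarray
    array_types[column] = type(arr).__name__
    column_lists[column] = df[column].tolist()  # plain Python list

print(array_types)
print(column_lists["Fee"])
```

If you only need to loop over the values, the list form is usually the easiest to work with; the array form is more useful when you pass the data on to NumPy functions.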

3. Iterate Over Columns Using DataFrame.items()

Pandas also provides methods for iterating over DataFrame columns. The df.items() method iterates over a DataFrame and returns each column as a pair of the column name and its content as a Series, for example, for (colname,colval) in df.items():. Older code uses df.iteritems() for the same purpose, but that method was deprecated in pandas 1.5 and removed in pandas 2.0.


# Iterate over columns using DataFrame.items()
for (colname,colval) in df.items():
    print(colname, colval.values)

Yields below output.


# Output:
Courses ['Spark' 'PySpark' 'Hadoop']
Fee [22000 25000 23000]
Duration ['30days' '50days' '55days']
Discount [1000. 2300. 1500.]

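You can also use items() to pull a single element out of each column, for example the value in the second row.


# Use items() to get the second value of each column
for name, values in df.items():
    print('{name}: {value}'.format(name=name, value=values[1]))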

Yields below output.


# Output:
Courses: PySpark
Fee: 25000
Duration: 50days
Discount: 2300.0
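
One caveat, sketched here under the assumption that your DataFrame may not always have the default 0, 1, 2 index: values[1] looks the element up by the row label 1, which happens to be the second row in this example. Series.iloc[1] selects by position instead, so it keeps working after the index changes. The df_by_course and second_row names below are only illustrative.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# Hypothetical variation: use the course name as the row index,
# so the row labels are no longer 0, 1, 2.
df_by_course = df.set_index("Courses")

second_row = {}
for name, values in df_by_course.items():
    # iloc selects by position, regardless of the index labels
    second_row[name] = values.iloc[1]

print(second_row)
```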

4. Iterate Over Columns in DataFrame Using enumerate()

You can also pass the DataFrame to enumerate() to get the position of each column along with its label, and use both in a for loop.


# Iterate over columns in pandas DataFrame using enumerate()
for (index, colname) in enumerate(df):
    print(index, df[colname].values)

Yields below output.


# Output:
0 ['Spark' 'PySpark' 'Hadoop']
1 [22000 25000 23000]
2 ['30days' '50days' '55days']
3 [1000. 2300. 1500.]

Use the same loop without .values if you want each column printed as a Series along with its position.


# Using enumerate()
for (index, column) in enumerate(df):
    print (index, df[column])

Yields below output.


# Output:
0 0      Spark
1    PySpark
2     Hadoop
Name: Courses, dtype: object
1 0    22000
1    25000
2    23000
Name: Fee, dtype: int64
2 0    30days
1    50days
2    55days
Name: Duration, dtype: object
3 0    1000.0
1    2300.0
2    1500.0
Name: Discount, dtype: float64

In the code above, df[column] is a Series, which can be converted into a NumPy ndarray, for example with np.asarray(df[column]).


# Using enumerate() & numpy.asarray()
import numpy as np

for (index, column) in enumerate(df):
    print (index, np.asarray(df[column]))

Yields below output.


# Output: 
0 ['Spark' 'PySpark' 'Hadoop']
1 [22000 25000 23000]
2 ['30days' '50days' '55days']
3 [1000. 2300. 1500.]

5. Use DataFrame.columns to Iterate Over Selected Columns

DataFrame.columns is an attribute (not a method) that returns an Index holding all the column labels of the DataFrame. It supports the same slicing syntax as a Python list, so you can slice it to select the columns you need. For instance, to iterate over all columns but the first one, use for column in df.columns[1:]:.


# Use DataFrame.columns
for column in df.columns[1:]:
    print(df[column])
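
Slicing by position is one way to pick columns. As a rough sketch of another option (not covered in the original examples), the loop below selects columns by dtype with DataFrame.select_dtypes() and sums only the numeric ones. It reuses the sample data from this article; the numeric_totals name is only illustrative.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# df.columns is an Index; slicing it returns another Index
remaining = list(df.columns[1:])
print(remaining)

# Iterate only over the numeric columns (Fee and Discount here)
numeric_totals = {}
for column in df.select_dtypes(include="number").columns:
    numeric_totals[column] = df[column].sum()

print(numeric_totals)
```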

Similarly, to iterate over all the columns in reverse order, use for column in df.columns[::-1]:.


# Iterate over all the columns in reversed order    
for column in df.columns[::-1]:
    print(df[column])

Also, remember that you can easily get the position of every column using for index, column in enumerate(df.columns):.


# Get the indices of all columns
for index, column in enumerate(df.columns):
    print(index, column)
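
To show what those positions are good for, here is a small sketch that assumes the same sample data. It uses each position with DataFrame.iloc to fetch the column and confirms that this matches the label-based lookup; Index.get_loc() goes the other way, from a label to its position. The positions dictionary is only illustrative.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

positions = {}
for index, column in enumerate(df.columns):
    by_position = df.iloc[:, index]  # column selected by integer position
    by_label = df[column]            # same column selected by label
    assert by_position.equals(by_label)
    positions[column] = index

print(positions)

# Look up the position of a single column by its label
duration_position = df.columns.get_loc("Duration")
print(duration_position)
```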

6. Use DataFrame.transpose().iterrows()

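You can also transpose the DataFrame with DataFrame.transpose() and then iterate over its rows with iterrows(). Each row of the transposed DataFrame corresponds to one column of the original, so the loop yields the column name together with the column's values.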


# Use DataFrame.transpose().iterrows()
for (column_name, column) in df.transpose().iterrows():
    print (column_name)

Yields below output.


# Output:
Courses
Fee
Duration
Discount
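
Transposing has a side effect worth knowing about. Because this DataFrame mixes strings, integers, and floats, the rows of the transposed DataFrame come back with object dtype, while items() keeps the original dtype of each column. The sketch below reuses the sample data to compare the two; the dictionary names are only illustrative.

```python
import pandas as pd

technologies = [
    ("Spark", 22000, "30days", 1000.0),
    ("PySpark", 25000, "50days", 2300.0),
    ("Hadoop", 23000, "55days", 1500.0),
]
df = pd.DataFrame(technologies, columns=["Courses", "Fee", "Duration", "Discount"])

# dtype of each "column" as seen through transpose().iterrows()
transposed_dtypes = {
    name: str(row.dtype) for name, row in df.transpose().iterrows()
}

# dtype of each column as seen through items()
items_dtypes = {name: str(col.dtype) for name, col in df.items()}

print(transposed_dtypes)
print(items_dtypes)
```

If you plan to do arithmetic on the values inside the loop, items() is usually the safer choice because the numeric dtypes survive.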

7. Complete Example of Iterating Over Columns in DataFrame


import pandas as pd
import numpy as np
technologies = [
            ("Spark", 22000,'30days',1000.0),
            ("PySpark",25000,'50days',2300.0),
            ("Hadoop",23000,'55days',1500.0)
            ]
df = pd.DataFrame(technologies,columns = ['Courses','Fee','Duration','Discount'])
print(df)

# Use getitem ([]) to iterate over columns
for column in df:
    print(df[column])
    
# Use getitem ([]) to iterate over columns in pandas DataFrame
for column in df:
    print(df[column].values)
    
# Iterate over columns using DataFrame.items()
# (iteritems() was removed in pandas 2.0; items() replaces it)
for (colname,colval) in df.items():
    print(colname, colval.values)
    
# Use items() to get the second value of each column
for name, values in df.items():
    print('{name}: {value}'.format(name=name, value=values[1]))
   
# Iterate over columns in pandas DataFrame using enumerate()
for (index, colname) in enumerate(df):
    print(index, df[colname].values)
    
# Using enumerate()
for (index, column) in enumerate(df):
    print (index, df[column])
    
# Using enumerate() & Numpy.asarray()
for (index, column) in enumerate(df):
    print (index, np.asarray(df[column]))
    
# Use DataFrame.columns
for column in df.columns[1:]:
    print(df[column])
    
# Iterate over all the columns in reversed order    
for column in df.columns[::-1]:
    print(df[column])
    
# Get the indices of all columns
for index, column in enumerate(df.columns):
    print(index, column)
    
# Use DataFrame.transpose().iterrows()
for (column_name, column) in df.transpose().iterrows():
    print (column_name)

Conclusion

In this article, you have learned how to iterate over the columns of a pandas DataFrame using DataFrame.items() (the replacement for the removed iteritems()), the getitem ([]) syntax, enumerate(), the DataFrame.columns attribute, numpy.asarray(), and DataFrame.transpose().iterrows(), with examples of each.

Happy Learning !!