  • Post last modified:March 27, 2024
Pandas apply() Return Multiple Columns

In pandas, apply() can return multiple columns when the applied function returns a pandas Series: with axis=1, each label in the returned Series becomes a column of the result. In this article, I will explain how to return multiple columns from the pandas apply() function.

Key Points –

  • apply() allows for the application of custom transformations to DataFrame rows or columns, enabling complex data manipulations tailored to specific needs.
  • By returning multiple values from the applied function, apply() can create several derived columns in a single pass over the rows.
  • apply() with axis=1 calls a Python function once per row, so it is usually slower than vectorized operations, especially on large datasets. Prefer vectorized alternatives when performance matters.
  • apply() is most useful for complex transformations or calculations that involve multiple columns and cannot be easily expressed using built-in pandas functions or methods.

Quick Examples of apply() Return Multiple Columns

If you are in a hurry, below is a quick example of how to return multiple columns from apply().


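# Example 1 - Return multiple columns from apply()
def multiply(row):
    # Work on a copy; pandas advises against mutating the object passed to apply()
    row = row.copy()
    # Access values by column label; integer keys such as row[0] are
    # deprecated as positional lookups on a labeled Series
    row['A1'] = row['A'] * 2
    row['B1'] = row['B'] * 3
    row['C1'] = row['C'] * 4
    return row

df = df.apply(multiply, axis=1)
print(df)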

Let’s create a sample DataFrame to work with some examples.


import pandas as pd

# Three rows of sample data with columns A, B and C
data = [(3, 5, 7), (2, 4, 6), (5, 8, 9)]
df = pd.DataFrame(data, columns=['A', 'B', 'C'])
print(df)

This yields the output below.


# Output:
   A  B  C
0  3  5  7
1  2  4  6
2  5  8  9

Return Multiple Columns from pandas apply()

To return multiple columns, have the applied function return a Series that contains the new data. Passing axis=1 to apply() calls the function multiply once for each row of the DataFrame, and the labels of each returned Series become the columns of the result. Here the returned Series, row, contains the new values as well as the original data.


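# Return multiple columns from apply()
def multiply(row):
    # Work on a copy; pandas advises against mutating the object passed to apply()
    row = row.copy()
    # Access values by column label; integer keys such as row[0] are
    # deprecated as positional lookups on a labeled Series
    row['A1'] = row['A'] * 2
    row['B1'] = row['B'] * 3
    row['C1'] = row['C'] * 4
    return row

df = df.apply(multiply, axis=1)
print(df)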

This yields the output below. The result contains the new columns ‘A1’, ‘B1’ and ‘C1’ alongside the original columns.


# Output:
   A  B  C  A1  B1  C1
0  3  5  7   6  15  28
1  2  4  6   4  12  24
2  5  8  9  10  24  36
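
A variation on the example above, sketched here as one possible approach: the function returns a new Series that holds only the derived values, and the result is attached to the original DataFrame afterwards. The names multiply_new, new_cols and result are hypothetical and chosen for illustration only.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

def multiply_new(row):  # hypothetical helper name, not from the article
    # Build a fresh Series holding only the derived values;
    # its index labels become the new column names.
    return pd.Series({'A1': row['A'] * 2, 'B1': row['B'] * 3, 'C1': row['C'] * 4})

new_cols = df.apply(multiply_new, axis=1)   # DataFrame with columns A1, B1, C1
result = pd.concat([df, new_cols], axis=1)  # originals and derived columns side by side
print(result)
```

This avoids modifying the row that pandas passes into the function, and it leaves the original df unchanged.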

Frequently Asked Questions on Pandas apply() Return Multiple Columns

How can I use apply() to return multiple columns in Pandas?

Call apply() with axis=1 and a custom function that returns a pandas Series. The function performs the desired calculations on each row, and the labels of the Series it returns become the columns of the resulting DataFrame. A function that returns a list or tuple can also produce multiple columns if you pass result_type='expand' to apply().
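
As a minimal sketch of the list-based route, the example below passes result_type='expand', an existing parameter of DataFrame.apply() that turns list-like return values into columns. The helper name multiply_list and the column names assigned afterwards are assumptions made for illustration.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

def multiply_list(row):  # hypothetical helper name
    # Return a plain list; it carries no labels of its own.
    return [row['A'] * 2, row['B'] * 3, row['C'] * 4]

# result_type='expand' spreads each returned list across columns 0, 1, 2
expanded = df.apply(multiply_list, axis=1, result_type='expand')
expanded.columns = ['A1', 'B1', 'C1']  # give the new columns readable names

result = df.join(expanded)  # join on the shared row index
print(result)
```

Because a list has no labels, the expanded columns are numbered by position, which is why they are renamed before the join.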

What is the advantage of using apply() to return multiple columns?

apply() offers flexibility in data manipulation by allowing you to apply custom transformations that may involve multiple columns. This enables you to perform complex operations that are not easily achieved with built-in Pandas functions or methods.

Are there alternatives to apply() for returning multiple columns in Pandas?

Alternatives include vectorized column operations (for example with DataFrame.assign()), list comprehensions, or the DataFrame.transform() method when the result has the same shape as the input. The right choice depends on the specific data manipulation task and performance requirements.
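
For the simple multiplications used in this article, a vectorized version might look like the sketch below. It uses DataFrame.assign() with whole-column arithmetic; the variable name vectorized is only illustrative.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

# Each expression operates on an entire column at once,
# so no Python function is called per row.
vectorized = df.assign(
    A1=df['A'] * 2,
    B1=df['B'] * 3,
    C1=df['C'] * 4,
)
print(vectorized)
```

assign() returns a new DataFrame and leaves df untouched, so it can be chained with other operations.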

Can I use apply() to perform calculations across multiple rows or columns simultaneously?

Yes. The axis parameter controls what the function receives: axis=0 (the default) passes each column to the function, while axis=1 passes each row. Inside the function you can combine any of the values in that column or row, which is useful for data aggregation and transformation tasks.
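
The difference between the two axis values can be sketched as follows. The min/max summary and the names col_stats and row_stats are assumptions chosen for illustration; they are not part of the original examples.

```python
import pandas as pd

df = pd.DataFrame([(3, 5, 7), (2, 4, 6), (5, 8, 9)], columns=['A', 'B', 'C'])

# axis=0: the function receives one column at a time.
# The returned Series labels ('min', 'max') become the rows of the result.
col_stats = df.apply(lambda col: pd.Series({'min': col.min(), 'max': col.max()}), axis=0)

# axis=1: the function receives one row at a time.
# The returned Series labels become the columns of the result.
row_stats = df.apply(lambda row: pd.Series({'min': row.min(), 'max': row.max()}), axis=1)

print(col_stats)
print(row_stats)
```

With axis=0 the result has one column per original column, while with axis=1 it has one row per original row.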

How can I ensure proper alignment of index when using apply() to return multiple columns?

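With axis=1, apply() keeps the row index of the original DataFrame, and the index labels of each returned Series become the column names of the result. Alignment problems typically arise afterwards: pandas aligns on index labels when you join or assign the result back, so if the result's index has been reset or changed, unmatched rows are filled with NaN. Keep the index of the result identical to the index of the original DataFrame.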
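
The sketch below illustrates this with a DataFrame whose row labels are 10, 20 and 30 (an assumed index chosen to make the effect visible). The names new_cols, aligned and misaligned are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    [(3, 5, 7), (2, 4, 6), (5, 8, 9)],
    columns=['A', 'B', 'C'],
    index=[10, 20, 30],  # non-default row labels, assumed for this example
)

new_cols = df.apply(
    lambda row: pd.Series({'A1': row['A'] * 2, 'B1': row['B'] * 3}),
    axis=1,
)

# apply() preserved the row labels, so the join lines up row by row.
aligned = df.join(new_cols)

# Resetting the index replaces the labels with 0, 1, 2, which no longer
# match 10, 20, 30; the join then fills the new columns with NaN.
misaligned = df.join(new_cols.reset_index(drop=True))

print(aligned)
print(misaligned)
```

Checking new_cols.index.equals(df.index) before joining is a quick way to catch this kind of mismatch.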

Conclusion

In this article, I have explained, with examples, how the pandas apply() function can return multiple columns based on custom transformations applied to DataFrame rows. This makes it a flexible way to create several derived features at once, although vectorized operations are usually faster on large datasets.

Happy Learning !!

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them.