  • Post category: Pandas
  • Post last modified: April 15, 2024
Pandas Concatenate Two Columns

How do you concatenate two or more columns of a Pandas DataFrame? You can use several methods, including the + operator and a number of Pandas functions. This operation is common in data manipulation and analysis, where you need to combine information from separate columns into a single column.

In this article, I will cover the approaches I use most often in my real-world projects to concatenate two or more columns of string/text type. Depending on your needs, you may have to add a separator when concatenating; hence, I will explain examples with a separator as well.

Key Points –

  • Pandas offers versatile methods like .str.cat() and DataFrame.agg() to concatenate two columns.
  • The + operator can be used directly for concatenating string columns in Pandas, but it performs addition for numeric columns.
  • Additionally, the DataFrame.apply() method can be used with custom functions to concatenate columns row by row (axis=1).
  • Selecting the appropriate concatenation method depends on factors such as data type, performance considerations, and specific concatenation requirements.
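As a rough illustration of the performance point in the list above, the sketch below times the vectorized + operator against a row-wise apply() on the same data. The frame size, the repeat count, and the function names with_plus and with_apply are arbitrary assumptions made for this example; row-wise apply() is usually the slower of the two on large frames, but the actual numbers depend on your machine and pandas version, so measure on your own data.

```python
import timeit

import pandas as pd

# Synthetic frame; the size is an arbitrary choice for illustration
n = 10_000
df = pd.DataFrame({"Courses": ["Spark"] * n, "Duration": ["30days"] * n})


def with_plus():
    # Vectorized concatenation of whole columns
    return df["Courses"] + "-" + df["Duration"]


def with_apply():
    # Row-wise concatenation: the join runs once per row
    return df[["Courses", "Duration"]].apply("-".join, axis=1)


# Both approaches produce the same strings
same_result = with_plus().tolist() == with_apply().tolist()

# Time a few repetitions of each approach
t_plus = timeit.timeit(with_plus, number=3)
t_apply = timeit.timeit(with_apply, number=3)

print("same result:", same_result)
print(f"+ operator: {t_plus:.4f}s, apply(): {t_apply:.4f}s")
```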

Quick Examples of Concatenate Two Columns

If you are in a hurry, below are some quick examples of how to concatenate two columns in Pandas DataFrame.


# Quick examples of concatenating two columns

# Example 1: Using + operator to combine two columns
df["Period"] = df["Courses"].astype(str) + "-" + df["Duration"]

# Example 2: Using apply() method to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].apply("-".join, axis=1)

# Example 3: Using DataFrame.agg() to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].agg("-".join, axis=1)

# Example 4: Using Series.str.cat() function
df["Period"] = df["Courses"].str.cat(df["Duration"], sep="-")

# Example 5: Using DataFrame.apply() and lambda function
df["Period"] = df[["Courses", "Duration"]].apply(lambda x: "-".join(x), axis=1)

# Example 6: Using map() function to combine two columns of text
df["Period"] = df["Courses"].map(str) + "-" + df["Duration"]

Now, let’s create a DataFrame with some rows and columns and then execute the examples provided to validate the results.


# Create DataFrame
import pandas as pd
technologies = ({
     'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
     'Fee' :[20000,25000,26000,22000,24000],
     'Duration':['30days','40days','35days','40days','60days'],
     'Discount':[1000,1500,2500,2100,2000]
               })
df = pd.DataFrame(technologies)
print("DataFrame:\n", df)

Yields the output below.

[Image: Pandas concatenate two columns]

Concatenate Two Columns Using + Operator in Pandas

You can use the + operator to concatenate two or more string/text columns in a Pandas DataFrame. Note that when the + operator is applied to numeric columns, it performs arithmetic addition instead of string concatenation, so convert such columns to strings first with astype(str).


# Using + operator to combine two columns
df["Period"] = df["Courses"].astype(str) + "-" + df["Duration"]
print("After concatenating the two columns:\n", df)

In the above example, the + operator concatenates the Courses and Duration columns with a hyphen between them, and the result is stored in a new column called Period. This example yields the output below.

[Image: Pandas concatenate two columns]
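To make the difference between addition and concatenation concrete, here is a minimal sketch. It uses a cut-down frame with the same column names as the article's example; the variable names total and label are only illustrative.

```python
import pandas as pd

# Small illustrative frame; column names mirror the example above
df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Discount": [1000, 1500],
})

# Numeric + numeric is element-wise addition, not concatenation
total = df["Fee"] + df["Discount"]

# Cast the numeric column to str first to get string concatenation
label = df["Courses"] + "-" + df["Fee"].astype(str)

print(total.tolist())  # [21000, 26500]
print(label.tolist())  # ['Spark-20000', 'PySpark-25000']
```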

Using the apply() Function

You can consolidate two or more columns of a DataFrame into a single column using the DataFrame.apply() function, which applies a function along a specific axis. To concatenate string columns, pass the join() method of the separator string as the function and set axis=1 so that it runs on each row.


# Using apply() method to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].apply("-".join, axis=1)
print("After concatenating the two columns:\n", df)

Yields the same output as above.
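The role of the axis argument is easier to see side by side. The sketch below runs the same join along both axes on a two-row frame; the variable names by_row and by_col are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Duration": ["30days", "40days"],
})

# axis=1: the function receives one row at a time
by_row = df.apply("-".join, axis=1)

# axis=0 (the default): the function receives one column at a time
by_col = df.apply("-".join, axis=0)

print(by_row.tolist())   # ['Spark-30days', 'PySpark-40days']
print(by_col.to_dict())  # {'Courses': 'Spark-PySpark', 'Duration': '30days-40days'}
```

Only axis=1 gives one combined string per row, which is what you need when building a new column.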

Using agg() to Concat String Columns of DataFrame

To concatenate multiple string columns, you can also use the DataFrame.agg() method. As in the previous example, select the columns you want to concatenate as a list, then call agg() with the separator's join() method and axis=1 to get the desired output.


# Using DataFrame.agg() to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].agg("-".join, axis=1)
print("After concatenating the two columns:\n", df)

Yields the same output as above.
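If you repeat this pattern often, it may be convenient to wrap it in a small function. The sketch below assumes a hypothetical helper named join_columns (not part of pandas) that casts the selected columns to strings so numeric columns can be included, then joins them with a configurable separator.

```python
import pandas as pd


def join_columns(frame, columns, sep="-"):
    """Hypothetical helper: join the given columns row by row with `sep`."""
    # Cast to str so that numeric columns can take part in the join
    return frame[columns].astype(str).agg(sep.join, axis=1)


df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

summary = join_columns(df, ["Courses", "Duration", "Fee"], sep=" | ")
print(summary.tolist())
# ['Spark | 30days | 20000', 'PySpark | 40days | 25000']
```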

Using Series.str.cat() Function to Concat Columns

The Series.str.cat() function concatenates two Series with an optional delimiter/separator. You can apply it to a DataFrame because selecting a single DataFrame column returns a Series object.


# Using Series.str.cat() function
df["Period"] = df["Courses"].str.cat(df["Duration"], sep="-")
print("After concatenating the two columns:\n", df)

Yields the same output as above.
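Two options of str.cat() are worth a quick sketch: the na_rep parameter for missing values and passing a list of Series to join more than two columns. The Level column and the placeholder text "unknown" are assumptions added for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Duration": ["30days", None, "35days"],
    "Level": ["basic", "advanced", "basic"],  # hypothetical extra column
})

# Without na_rep, a missing value on either side makes the result missing
plain = df["Courses"].str.cat(df["Duration"], sep="-")

# na_rep substitutes a placeholder for missing values
filled = df["Courses"].str.cat(df["Duration"], sep="-", na_rep="unknown")

# Pass a list of Series to join more than two columns in one call
combined = df["Courses"].str.cat(
    [df["Duration"], df["Level"]], sep="-", na_rep="unknown"
)

print(filled.tolist())
# ['Spark-30days', 'PySpark-unknown', 'Hadoop-35days']
print(combined.tolist())
# ['Spark-30days-basic', 'PySpark-unknown-advanced', 'Hadoop-35days-basic']
```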

Using DataFrame.apply() and Lambda Function to Concat

The apply() method with a lambda function can be used to achieve the same result. You can generalize this method to concatenate an arbitrary number of string columns by replacing df[["Courses", "Duration"]] with any column slice of your DataFrame.


# Using DataFrame.apply() and lambda function
df["Period"] = df[["Courses", "Duration"]].apply(lambda x: "-".join(x), axis=1)
print("After concatenating the two columns:\n", df)

Yields the same output as above.
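A lambda becomes useful once the join needs extra logic. The following sketch, under the assumption that you want to skip missing values and include a numeric column, selects a label-based column slice and joins whatever values are present in each row. The Summary column name is illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Duration": ["30days", None, "35days"],
})

# Label-based slice: every column from Courses through Duration
subset = df.loc[:, "Courses":"Duration"]

# Drop missing values in each row, cast the rest to str, then join
df["Summary"] = subset.apply(
    lambda row: "-".join(row.dropna().astype(str)), axis=1
)

print(df["Summary"].tolist())
# ['Spark-20000-30days', 'PySpark-25000', 'Hadoop-26000-35days']
```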

Concat Two Columns Using map() Function

You can also use the Series.map() function to convert each value of a column to a string (or apply any other element-wise function) before joining the columns with the + operator. This gives you room to apply custom logic or conditions as needed.


# Using map() function to combine two columns of text
df["Period"] = df["Courses"].map(str) + "-" + df["Duration"]
print("After concatenating the two columns:\n", df)

Yields the same output as above.
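As a small sketch of the custom logic mentioned above, map() can take any element-wise function, not just str. Here it formats the Fee values with a thousands separator before they are joined; the Label column and the formatting choice are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
})

# map() applies the function to each value; here it formats the fee as text
fee_text = df["Fee"].map(lambda fee: f"{int(fee):,}")

# The formatted text can then be joined with the + operator
df["Label"] = df["Courses"] + " (" + fee_text + ")"

print(df["Label"].tolist())
# ['Spark (20,000)', 'PySpark (25,000)']
```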

Complete Example of Concatenate Two Columns in Pandas

Below is a complete example of how to concatenate two or more columns of a Pandas DataFrame.


import pandas as pd
technologies = ({
     'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
     'Fee' :[20000,25000,26000,22000,24000],
     'Duration':['30days','40days','35days','40days','60days'],
     'Discount':[1000,1500,2500,2100,2000]
               })
df = pd.DataFrame(technologies)
print(df)

# Using + operator to combine two columns
df["Period"] = df['Courses'].astype(str) +"-"+ df["Duration"]
print(df)

# Using apply() method to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].apply("-".join, axis=1)
print(df)

# Using DataFrame.agg() to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].agg("-".join, axis=1)
print(df)

# Using Series.str.cat() function
df["Period"] = df["Courses"].str.cat(df["Duration"], sep = "-")
print(df)

# Using DataFrame.apply() and lambda function
df["Period"] = df[["Courses", "Duration"]].apply(lambda x: "-".join(x), axis =1)
print(df)

# Using map() function to combine two columns of text
df["Period"] = df["Courses"].map(str) + "-" + df["Duration"]
print(df)

FAQ on Concatenate Two DataFrame Columns

What does it mean to concatenate two DataFrame columns?

Concatenating two DataFrame columns means combining the data from two separate columns in a DataFrame to create a new column.

How can I concatenate two columns in a pandas DataFrame?

You can concatenate two columns in a pandas DataFrame using various methods such as the + operator, the apply() method, the str.cat() function, or the map() function.

Can I concatenate two columns using the + operator in pandas?

You can concatenate two columns in a Pandas DataFrame using the + operator. When you use the + operator between two columns, Pandas performs element-wise addition for numeric columns and concatenation for string/text columns.

How do I concatenate multiple columns in pandas?

To concatenate multiple columns in pandas, you can chain the + operator, pass several columns to apply() or agg() with a join() function, or pass a list of columns to str.cat().

Is there a way to concatenate columns while checking conditions in pandas?

You can use methods like apply() with a lambda function or the map() function to concatenate columns while also incorporating conditions or custom logic as needed.
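A minimal sketch of such a condition, assuming you only want to append the duration for courses whose fee is above a threshold. The threshold of 22000 and the Label column are hypothetical choices for this example.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Fee": [20000, 25000, 26000],
    "Duration": ["30days", "40days", "35days"],
})

# Hypothetical rule: append the duration only when the fee exceeds 22000
df["Label"] = df.apply(
    lambda row: f"{row['Courses']}-{row['Duration']}"
    if row["Fee"] > 22000
    else row["Courses"],
    axis=1,
)

print(df["Label"].tolist())
# ['Spark', 'PySpark-40days', 'Hadoop-35days']
```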

Conclusion

In this article, I have explained how to concatenate two columns in a pandas DataFrame using the + operator, apply() method, str.cat() function, map() function, or agg() method.

Happy Learning !!


Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, he has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen's journey in the field of data engineering has been one of continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with data as he comes across them. Follow Naveen @ LinkedIn and Medium