  • Post category:Pandas
  • Post last modified:May 18, 2024
Pandas Concatenate Two Columns

How do you concatenate two or more columns of a Pandas DataFrame? You can use several methods, including the + operator and Pandas functions such as str.cat(), apply(), agg(), and map(). This operation is common in data manipulation and analysis when you need to combine information from different columns into a single column.

In this article, I will cover the ways I use most often in my projects to concatenate two or more columns of string/text type. Depending on your needs, you may have to add a separator between the values, so the examples include a separator as well.

Key Points –

  • Pandas offers versatile methods like .str.cat() and DataFrame.agg() to efficiently concatenate two columns.
  • The + operator can be used directly for concatenating string columns in Pandas, but it performs addition for numeric columns.
  • The DataFrame.apply() method with axis=1 applies a custom function to each row, which can be used to join that row's values.
  • Selecting the appropriate concatenation method depends on factors such as data type, performance considerations, and specific concatenation requirements.

Quick Examples of Concatenate Two Columns

Following are quick examples of concatenating two columns in DataFrame.


# Quick examples of concatenate two columns 

# Example 1: Using + operator to combine two columns
df["Period"] = df['Courses'].astype(str) +"-"+ df["Duration"]

# Example 2: Using apply() method to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].apply("-".join, axis=1)

# Example 3: Using DataFrame.agg() to combine two columns of text
df["Period"] = df[['Courses', 'Duration']].agg('-'.join, axis=1)

# Example 4: Using Series.str.cat() function
df["Period"] = df["Courses"].str.cat(df["Duration"], sep="-")

# Example 5: Using DataFrame.apply() and lambda function
df["Period"] = df[["Courses", "Duration"]].apply(lambda x: "-".join(x), axis=1)

# Example 6: Using map() function to combine two columns of text
df["Period"] = df["Courses"].map(str) + "-" + df["Duration"]

To run some examples of concatenating two columns in Pandas DataFrame, let’s create Pandas DataFrame using data from a dictionary.


# Create DataFrame
import pandas as pd
technologies = ({
     'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
     'Fee' :[20000,25000,26000,22000,24000],
     'Duration':['30days','40days','35days','40days','60days'],
     'Discount':[1000,1500,2500,2100,2000]
               })
df = pd.DataFrame(technologies)
print("DataFrame:\n", df)

This prints a DataFrame with five rows and four columns: Courses, Fee, Duration, and Discount.

Using + Operator to Concatenate Two Columns

In a Pandas DataFrame, the + operator concatenates two or more string/text columns, combining their values element-wise. However, it’s important to note that when applied to numeric columns, the + operator performs arithmetic addition rather than string concatenation.


# Using + operator to combine two columns
df["Period"] = df['Courses'].astype(str) + "-" + df["Duration"]
print("After concatenating the two columns:\n", df)

This code uses the + operator to concatenate the Courses and Duration columns with a hyphen separator and stores the result in a new column called Period. The new column holds values such as Spark-30days and PySpark-40days.
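To make the difference between arithmetic addition and string concatenation concrete, here is a minimal sketch on a reduced version of the sample data. The column names Total and Label are hypothetical and exist only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Discount": [1000, 1500],
})

# Numeric + numeric: the + operator performs element-wise addition
df["Total"] = df["Fee"] + df["Discount"]

# To concatenate a numeric column as text, convert it to string first
df["Label"] = df["Courses"] + "-" + df["Fee"].astype(str)

print(df)
```

Converting with astype(str) before using + is what makes the numeric Fee column behave like text here.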

Using the apply() Function

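You can also combine two or more columns of a DataFrame into a single column using the DataFrame.apply() function, which applies a function along a given axis. With axis=1, the function receives one row at a time, so passing the string method "-".join joins that row's values with a hyphen. All selected columns must contain strings, and because the function runs once per row, this approach is usually slower than the + operator or str.cat() on large DataFrames.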


# Using apply() method to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].apply("-".join, axis=1)
print("After concatenating the two columns:\n", df)

Yields the same output as above.

Using agg() to Concat String Columns of DataFrame

To concatenate multiple string columns, you can also use the DataFrame.agg() method. As in the previous example, select the columns you want to concatenate with a list, then call agg() with the join() function and axis=1 to get the same result.


# Using DataFrame.agg() to combine two columns of text
df["Period"] = df[['Courses', 'Duration']].agg('-'.join, axis=1)
print("After concatenating the two columns:\n", df)

Yields the same output as above.

Using Series.str.cat() Function to Concat Columns

The Series.str.cat() function concatenates two Series with an optional delimiter/separator. Because selecting a single DataFrame column returns a Series, you can call it directly on DataFrame columns.


# Using Series.str.cat() function 
df["Period"] = df["Courses"].str.cat(df["Duration"], sep="-")
print("After concatenating the two columns:\n", df)

Yields the same output as above.
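As a rough sketch of two further str.cat() options, the example below passes a list of Series to join more than two columns and uses the na_rep parameter to substitute a placeholder for missing values. The missing course name, the Summary column, and the "unknown" placeholder are assumptions made up for this illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", None],   # None simulates a missing value
    "Fee": [20000, 25000, 26000],
    "Duration": ["30days", "40days", "35days"],
})

# Without na_rep, a missing value in any input makes that row's result missing
without_rep = df["Courses"].str.cat(df["Duration"], sep="-")

# Pass a list of Series to join several columns; na_rep fills missing values
df["Summary"] = df["Courses"].str.cat(
    [df["Duration"], df["Fee"].astype(str)], sep="-", na_rep="unknown"
)

print(without_rep)
print(df)
```

Note that the numeric Fee column is still converted with astype(str) first, since str.cat() works on string data.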

Using apply() & Lambda

The apply() method combined with a lambda function achieves the same result. By replacing df[["Courses", "Duration"]] with any selection of string columns from your DataFrame, this method generalizes to an arbitrary number of columns.


# Using DataFrame.apply() and lambda function
df["Period"] = df[["Courses", "Duration"]].apply(lambda x: "-".join(x), axis=1)
print("After concatenating the two columns:\n", df)

Yields the same output as above.
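One way to generalize this to any number of columns is sketched below. The helper join_columns and the Summary column are hypothetical names, not part of Pandas, and the astype(str) call is an added assumption so that numeric columns can take part in the join.

```python
import pandas as pd

def join_columns(frame, columns, sep="-"):
    """Hypothetical helper: join the given columns row by row as text."""
    # astype(str) converts every selected column to strings before joining
    return frame[columns].astype(str).apply(lambda row: sep.join(row), axis=1)

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Fee": [20000, 25000],
    "Duration": ["30days", "40days"],
})

# Join three columns, including the numeric Fee column
df["Summary"] = join_columns(df, ["Courses", "Fee", "Duration"])
print(df)
```

Because apply() with axis=1 calls the lambda once per row, this is convenient rather than fast; for large DataFrames the + operator or str.cat() is usually the better choice.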

Concat Two Columns Using map() Function

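You can also use the Series.map() function, which applies a function to each element of a column. Here map(str) converts each Courses value to a string before the + operator concatenates it with the separator and the Duration column. Passing a different function to map() lets you apply custom logic to the values before joining them.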


# Using map() function to combine two columns of text
df["Period"] = df["Courses"].map(str) + "-" + df["Duration"]
print("After concatenating the two columns:\n", df)

Yields the same output as above.
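As a small illustration of passing something other than str to map(), the sketch below upper-cases the course name before concatenating. The choice of str.upper is an arbitrary assumption for demonstration; any function that returns a string would work the same way.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark"],
    "Duration": ["30days", "40days"],
})

# map() applies str.upper to each course name, then + joins the pieces
df["Period"] = df["Courses"].map(str.upper) + "-" + df["Duration"]
print(df)
```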

Complete Example of Concatenate Two Columns in Pandas

Below is a complete example of how to concat two or multiple columns on Pandas DataFrame.


import pandas as pd
technologies = ({
     'Courses':["Spark","PySpark","Hadoop","Python","pandas"],
     'Fee' :[20000,25000,26000,22000,24000],
     'Duration':['30days','40days','35days','40days','60days'],
     'Discount':[1000,1500,2500,2100,2000]
               })
df = pd.DataFrame(technologies)
print(df)

# Using + operator to combine two columns
df["Period"] = df['Courses'].astype(str) +"-"+ df["Duration"]
print(df)

# Using apply() method to combine two columns of text
df["Period"] = df[["Courses", "Duration"]].apply("-".join, axis=1)
print(df)

# Using DataFrame.agg() to combine two columns of text
df["Period"] = df[['Courses', 'Duration']].agg('-'.join, axis=1)
print(df)

# Using Series.str.cat() function
df["Period"] = df["Courses"].str.cat(df["Duration"], sep = "-")
print(df)

# Using DataFrame.apply() and lambda function
df["Period"] = df[["Courses", "Duration"]].apply(lambda x: "-".join(x), axis=1)
print(df)

# Using map() function to combine two columns of text
df["Period"] = df["Courses"].map(str) + "-" + df["Duration"]
print(df)

FAQ on Concatenate Two DataFrame Columns

What does it mean to concatenate two DataFrame columns?

Concatenating two DataFrame columns means combining the data from two separate columns in a DataFrame to create a new column.

Can I concatenate two columns using the + operator in pandas?

You can concatenate two columns in a Pandas DataFrame using the + operator. When you use the + operator between two columns, Pandas performs element-wise addition for numeric columns and concatenation for string/text columns.

How do I concatenate multiple columns in pandas?

To concatenate multiple columns in pandas, you can use methods like apply() with a lambda function, str.cat() function, or the map() function. These methods allow you to concatenate multiple columns efficiently.

Is there a way to concatenate columns while checking conditions in pandas?

You can use methods like apply() with a lambda function or the map() function to concatenate columns while also incorporating conditions or custom logic as needed.
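A minimal sketch of conditional concatenation with apply() might look like the following. The rule used here (append a "(sale)" tag when Discount is at least 1500) and the function name label are hypothetical, chosen only to show the pattern.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop"],
    "Duration": ["30days", "40days", "35days"],
    "Discount": [1000, 1500, 2500],
})

def label(row):
    """Hypothetical rule: tag rows whose Discount is at least 1500."""
    base = f"{row['Courses']}-{row['Duration']}"
    return base + " (sale)" if row["Discount"] >= 1500 else base

# axis=1 passes each row to label(), which returns the combined text
df["Period"] = df.apply(label, axis=1)
print(df)
```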

Conclusion

In summary, you can concatenate two columns in a Pandas DataFrame by using the + operator, the apply() method, the str.cat() function, the map() function, or the agg() function.

Happy Learning !!
