How to Combine Two Columns into One in R

  • Post category: R Programming
  • Post last modified: March 27, 2024

How do you combine (merge) two or more columns into one column in R? It is a common operation when working with data, and there are several ways to do it, for example with base R functions or the dplyr package.

Key Points

  • Combining two columns into a single column in R is often necessary when dealing with datasets where related information is spread across multiple columns.
  • The paste() function is commonly used for combining two or more columns into a single character column, allowing customization of the separator between the values.
  • When combining columns, it’s important to consider the data types of the original columns and whether the result should be a character vector, factor, or another data type.
  • Users can use the paste() function within functions like mutate() from the dplyr package to create new variables or update existing ones within a data frame.
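  • Handling missing (NA) values matters when combining columns: paste() and paste0() convert NA to the literal string "NA" instead of propagating it, so missing values may need to be handled before or after combining. paste0() concatenates without a separator, whereas the collapse argument joins all elements of the result into a single string rather than combining row by row.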
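
To make the last point concrete, here is a minimal sketch of how paste() treats NA and one possible way to work around it. The df_na data frame and the fallback rule (use the City alone when State is missing) are assumptions for illustration, not part of the article's dataset.

```r
# Hypothetical data with a missing State value (illustration only)
df_na <- data.frame(
  City  = c("Chicago", "Houston"),
  State = c("IL", NA),
  stringsAsFactors = FALSE
)

# paste() turns NA into the literal string "NA"
combined <- paste(df_na$City, df_na$State, sep = ", ")
print(combined)
# [1] "Chicago, IL" "Houston, NA"

# One possible fix: fall back to the City alone when State is missing
df_na$Location <- ifelse(
  is.na(df_na$State),
  df_na$City,
  paste(df_na$City, df_na$State, sep = ", ")
)
print(df_na$Location)
# [1] "Chicago, IL" "Houston"

# collapse joins all elements into ONE string instead of working row by row
collapsed <- paste(df_na$City, collapse = "")
print(collapsed)
# [1] "ChicagoHouston"
```

Whether to drop, replace, or keep the "NA" text depends on the dataset, so treat the ifelse() rule as one option among several.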

1. Quick Examples of Combining Two Columns into One in R

If you are in a hurry, below are quick examples of combining two columns into one.


# Quick examples of combining two columns

# Example 1: Using paste function with default separator
df$Location <- paste(df$City, df$State)

# Example 2: Combining two columns into a new column
# using paste() with specified separator
df$Location <- paste(df$City, df$State, sep = ", ")

# Example 3: Using paste0 function (no separator)
df$Location <- paste0(df$City, df$State)

# Example 4: Using interaction() function
df$Location <- interaction(df$City, df$State, sep = "")

# Example 5: Using sprintf() function
df$Location <- sprintf("%s%s", df$City, df$State)

# Example 6: Using the dplyr package
library(dplyr)
df <- df %>% mutate(Location = paste(City, State))

Let’s create an R DataFrame and explore examples and output.


# Create dataframe
df <- data.frame(
  City = c("New York", "Los Angeles", "Chicago", "Houston"),
  State = c("NY", "CA", "IL", "TX"),
  Population = c(8175133, 3792621, 2695598, 2328066)
)

# Displaying the original dataframe
print("Original DataFrame:")
print(df)

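Yields below output.

# Output:
[1] "Original DataFrame:"
         City State Population
1    New York    NY    8175133
2 Los Angeles    CA    3792621
3     Chicago    IL    2695598
4     Houston    TX    2328066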

2. Combine Two Columns into One using paste()

The paste function is used to concatenate the values of two specified columns of the R data frame into one column for each row. In the example below, I combine the columns of City and State into one column named Location for each row. By default, this function uses space as a delimiter to separate data.


# Using paste function with default separator
print("DataFrame with Combined Column:")
df$Location <- paste(df$City, df$State)
print(df)

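Yields below output.

# Output:
[1] "DataFrame with Combined Column:"
         City State Population       Location
1    New York    NY    8175133    New York NY
2 Los Angeles    CA    3792621 Los Angeles CA
3     Chicago    IL    2695598     Chicago IL
4     Houston    TX    2328066     Houston TX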

3. Using paste() with Customized Separator

Alternatively, you can use the sep argument of paste() to choose the delimiter. The example below passes sep = ", " along with the City and State columns, so the two values are joined by a comma and a space in the new Location column.


# Combining two columns into a new column
# using paste() with specified separator
df$Location <- paste(df$City, df$State, sep = ", ")

# Displaying the updated dataframe with the new combined column
print("DataFrame with Combined Column:")
print(df)

Yields below output.


# Output:
[1] "DataFrame with Combined Column:"
         City State Population        Location
1    New York    NY    8175133    New York, NY
2 Los Angeles    CA    3792621 Los Angeles, CA
3     Chicago    IL    2695598     Chicago, IL
4     Houston    TX    2328066     Houston, TX
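
The same approach extends to more than two columns. The sketch below uses do.call() to pass a set of columns chosen by name to paste(); the cols vector, the " | " separator, and the Summary column name are illustrative choices, not part of the original example.

```r
# Recreate the article's data frame so the example is self-contained
df <- data.frame(
  City = c("New York", "Los Angeles", "Chicago", "Houston"),
  State = c("NY", "CA", "IL", "TX"),
  Population = c(8175133, 3792621, 2695598, 2328066)
)

# Columns to combine, selected by name (hypothetical selection)
cols <- c("City", "State", "Population")

# do.call() passes each selected column, plus sep, as an argument to paste();
# the numeric Population column is converted to character automatically
df$Summary <- do.call(paste, c(df[cols], sep = " | "))
print(df$Summary[1])
# [1] "New York | NY | 8175133"
```

This form is handy when the list of columns is only known at run time, since nothing about the call changes when cols grows or shrinks.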

4. Merge Two Columns into One using paste0()

The paste0() function concatenates two columns of an R data frame into one column without any separator. The code below creates a new column called Location in the data frame df by joining each value in the City column directly to the corresponding value in the State column.


# Using paste0 function (no separator)
df$Location <- paste0(df$City, df$State)
print("DataFrame with Combined Column:")
print(df)

Yields below output.


# Output:
[1] "DataFrame with Combined Column:"
         City State Population      Location
1    New York    NY    8175133    New YorkNY
2 Los Angeles    CA    3792621 Los AngelesCA
3     Chicago    IL    2695598     ChicagoIL
4     Houston    TX    2328066     HoustonTX
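
Because paste0() inserts nothing between its arguments, you can also place literal strings exactly where you want them. The following is a small sketch on a two-row subset of the data; the "City (State)" layout is just an illustrative format.

```r
# Two-row subset of the article's data (illustration only)
df <- data.frame(
  City = c("New York", "Chicago"),
  State = c("NY", "IL")
)

# paste0(...) behaves like paste(..., sep = "")
same <- identical(paste0(df$City, df$State),
                  paste(df$City, df$State, sep = ""))
print(same)
# [1] TRUE

# Literal pieces can be mixed in between the columns
df$Location <- paste0(df$City, " (", df$State, ")")
print(df$Location)
# [1] "New York (NY)" "Chicago (IL)"
```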

5. Combine Two Columns into One using interaction()

The interaction() function is typically used to create a factor variable that represents the interaction of two or more categorical variables. In the example below, the City and State columns are passed to interaction() along with sep = "", which merges them into one column named Location.


# Using interaction() function
df$Location <- interaction(df$City, df$State, sep = "")
print(df)

The printed output is the same as above, although Location is now stored as a factor rather than a character column.
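
Since interaction() returns a factor, it may help to inspect the result before using it as text. The sketch below checks the class and the number of levels, then converts to character; using drop = TRUE and as.character() here is a suggestion, not something the original example does.

```r
# City and State columns from the article's data frame
df <- data.frame(
  City = c("New York", "Los Angeles", "Chicago", "Houston"),
  State = c("NY", "CA", "IL", "TX")
)

loc <- interaction(df$City, df$State, sep = "")
print(class(loc))
# [1] "factor"

# By default the factor has a level for every City/State combination
print(nlevels(loc))
# [1] 16

# drop = TRUE keeps only the combinations that actually occur
loc_used <- interaction(df$City, df$State, sep = "", drop = TRUE)
print(nlevels(loc_used))
# [1] 4

# Convert to character if a plain text column is wanted
df$Location <- as.character(loc_used)
print(df$Location)
# [1] "New YorkNY"    "Los AngelesCA" "ChicagoIL"     "HoustonTX"
```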

6. Using sprintf() Function

Similarly, you can combine the two columns into one column of an R data frame using the sprintf() function. For example, sprintf("%s%s", df$City, df$State) returns the values for the Location column by combining the City and State columns of the data frame, where %s is a placeholder for a character string.


# Using sprintf() function
df$Location <- sprintf("%s%s", df$City, df$State)
print(df)

The output is the same as the above.
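
The format string gives sprintf() more control than a single separator. As a sketch, the example below adds punctuation and the numeric Population column using the %d placeholder; the Label column name and the output layout are illustrative assumptions.

```r
# Two-row subset of the article's data (illustration only)
df <- data.frame(
  City = c("New York", "Chicago"),
  State = c("NY", "IL"),
  Population = c(8175133, 2695598)
)

# %s formats strings; %d formats whole numbers
# (R accepts doubles for %d when they hold exact integer values)
df$Label <- sprintf("%s, %s (pop. %d)", df$City, df$State, df$Population)
print(df$Label)
# [1] "New York, NY (pop. 8175133)" "Chicago, IL (pop. 2695598)"
```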

7. Using dplyr Package

So far, we have seen examples of combining the columns of a data frame into one column using the base R functions. Now, we will use a dplyr package function mutate() to combine the columns into one column.


# Using the dplyr package
library(dplyr)
df <- df %>% mutate(Location = paste(City, State))
print(df)

Here, the %>% operator pipes the data frame df into the mutate() function from the dplyr package. mutate() adds new variables or modifies existing ones, and paste() concatenates the values of the City and State columns. Yields below output.


# Output:
         City State Population       Location
1    New York    NY    8175133    New York NY
2 Los Angeles    CA    3792621 Los Angeles CA
3     Chicago    IL    2695598     Chicago IL
4     Houston    TX    2328066     Houston TX
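
If the original columns are no longer needed after combining, mutate() can drop them in the same step. This sketch assumes dplyr 1.0.0 or later, where mutate() accepts the .keep argument; the df_combined name is hypothetical.

```r
library(dplyr)

# Recreate the article's data frame so the example is self-contained
df <- data.frame(
  City = c("New York", "Los Angeles", "Chicago", "Houston"),
  State = c("NY", "CA", "IL", "TX"),
  Population = c(8175133, 3792621, 2695598, 2328066)
)

# .keep = "unused" drops the columns that were used to build Location
# (assumes dplyr >= 1.0.0)
df_combined <- df %>%
  mutate(Location = paste(City, State, sep = ", "), .keep = "unused")

print(names(df_combined))
# [1] "Population" "Location"
```

Leaving .keep at its default keeps City and State alongside the new column, which matches the output shown above.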

8. Conclusion

In this article, I have explained how to combine/merge two data frame columns into one column in R using the base R functions paste(), paste0(), interaction(), and sprintf(), and the mutate() function from the dplyr package. The paste() function is commonly used for combining two or more columns into a single character column, allowing customization of the separator between the values. You can also use paste() within functions like mutate() from the dplyr package to create new variables or update existing ones within a data frame.

Happy learning!!

Vijetha

Vijetha is an experienced technical writer with a strong command of various programming languages. She has had the opportunity to work extensively with a diverse range of technologies, including Python, Pandas, NumPy, and R. Throughout her career, Vijetha has consistently exhibited a remarkable ability to comprehend intricate technical details and adeptly translate them into accessible and understandable materials.