  • Post last modified:March 27, 2024
Pandas Split Column into Two Columns

The Pandas Series.str.split() function splits the string values of a column into two (or more) columns based on a specified separator or delimiter. It works like Python's built-in str.split() method, but whereas str.split() operates on a single string, Series.str.split() applies the split to every value of the specified Series (DataFrame column).
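
To make the comparison concrete, here is a minimal sketch that runs both methods on the same kind of data. The variable names (single, names, split_names) are only illustrative.

```python
import pandas as pd

# Python's built-in str.split() works on one string at a time
single = "Pramodh_Roy".split("_")

# Series.str.split() applies the same split to every element of a Series
names = pd.Series(["Pramodh_Roy", "Leena_Singh"])
split_names = names.str.split("_")

print(single)
print(split_names.tolist())
```

Without expand=True, each element of the result is a Python list of the split parts.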

In this article, I will explain the syntax and parameters of Series.str.split() and show, with examples, how to use it to split a column into multiple columns in Pandas.

Related: Split Pandas DataFrame by Column Value.

1. Quick Examples of Split Column into Two Columns

Following are quick examples of splitting a string column into two columns.


# Below are the quick examples.

# Example 1: Split string column into two new columns using '_' delimiter
df[['First Name', 'Last Name']] = df.Student_details.str.split("_", expand=True)

# Example 2: Split single column into two columns using ',' delimiter
df[['First Name', 'Last Name']] = df.Student_details.str.split(",", expand=True)

# Example 3: Split single column into two columns using apply() and ',' delimiter
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split(",")))

# Example 4: Split single column into two columns using apply() and '_' delimiter
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split("_")))

2. Syntax of Series.str.split()

Following is the syntax of Series.str.split().


# Syntax of Series.str.split()
Series.str.split(pat=None, n=-1, expand=False)

2.1 Parameters of Series.str.split()

  • pat: (str type) The delimiter on which to split the string values. By default, it splits on whitespace.
  • n: (int type) The maximum number of splits. The default is -1, which means all possible splits are made.
  • expand: (bool type) The default is False, which returns a Series of lists. If it is set to True, the function returns a DataFrame with one column per split part.

2.2 Return Value

It returns a Series when expand=False (the default) and a DataFrame when expand=True.
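
A quick way to see the two return types is to check them directly. The sketch below assumes a small sample Series; note that the expanded DataFrame gets integer column labels by default, which is why the examples in this article assign the result to named columns.

```python
import pandas as pd

s = pd.Series(["Pramodh_Roy", "Leena_Singh"])

as_series = s.str.split("_")              # expand=False (default)
as_frame = s.str.split("_", expand=True)  # expand=True

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame.columns.tolist())
```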

3. Usage of Series.str.split()

Pandas provides the Series.str.split() function to split a string column into two or more columns around a specified delimiter. Delimited string values are multiple values stored in a single column and separated by dashes, whitespace, commas, etc. This function returns a Pandas Series or DataFrame.

Let’s create a Pandas DataFrame using data from a Python dictionary. The DataFrame has one string column named 'Student_details', which I would like to split into two string columns named 'First Name' and 'Last Name'.


import pandas as pd

# Sample data: 'Student_details' holds '_'-delimited names
technologies = {
    'Student_details': ["Pramodh_Roy", "Leena_Singh", "James_William", "Addem_Smith"],
    'Courses': ["Spark", "PySpark", "Pandas", "Hadoop"],
    'Fee': [25000, 20000, 22000, 25000]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)

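Yields below output.


# Output:
Create DataFrame:
  Student_details  Courses    Fee
0     Pramodh_Roy    Spark  25000
1     Leena_Singh  PySpark  20000
2   James_William   Pandas  22000
3     Addem_Smith   Hadoop  25000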

4. Split String Column into Two Columns in Pandas

Apply Pandas Series.str.split() to a DataFrame column that holds delimited string values to split it into multiple columns. Here, the values of the column we want to split are separated by the '_' (underscore) delimiter, so we pass '_' as the first argument to the Series.str.split() function.

Let’s apply this function and split the column into two columns.


# Split string column into two new columns
df[['First Name', 'Last Name']] = df.Student_details.str.split("_", expand = True)
print("After splitting a column into two columns:\n", df)

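Yields below output.


# Output:
After splitting a column into two columns:
  Student_details  Courses    Fee First Name Last Name
0     Pramodh_Roy    Spark  25000    Pramodh       Roy
1     Leena_Singh  PySpark  20000      Leena     Singh
2   James_William   Pandas  22000      James   William
3     Addem_Smith   Hadoop  25000      Addem     Smith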
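
The two-column assignment in this section assumes every value contains exactly one underscore. The sketch below, on made-up data, shows what happens when a value has no delimiter: the missing part is filled with a null value. If some value produced more than two parts, the assignment to two column names would typically raise an error, so the n parameter is worth considering for such data.

```python
import pandas as pd

s = pd.Series(["Pramodh_Roy", "Leena"])

# The second value has no underscore, so its second part is missing
parts = s.str.split("_", expand=True)

print(parts[0].tolist())
print(parts[1].isna().tolist())
```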

5. Use ‘,’ Delimiter & Split Column

In this example, the values of the column we want to split are separated by the ',' (comma) delimiter, so we pass ',' to Series.str.split().


# Replace the column values with
# ','-delimited strings
df['Student_details'] = ["Pramodh, Roy", "Leena, Singh", "James, William", "Addem, Smith"]

# Split single column into two columns using ',' delimiter
df[['First Name', 'Last Name']] = df.Student_details.str.split(",", expand=True)
print("After splitting a column into two columns:\n", df)

Yields below output.


# Output:
After splitting a column into two columns:
  Student_details  Courses    Fee First Name Last Name
0    Pramodh, Roy    Spark  25000    Pramodh       Roy
1    Leena, Singh  PySpark  20000      Leena     Singh
2  James, William   Pandas  22000      James   William
3    Addem, Smith   Hadoop  25000      Addem     Smith
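
One detail the right-aligned output hides: because the sample values have a space after each comma, splitting on "," alone leaves that space at the start of the second part. The following is a minimal sketch of one way to handle it, stripping whitespace from each resulting column; splitting on ", " would be another option. The name parts is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({"Student_details": ["Pramodh, Roy", "Leena, Singh"]})

# Splitting on "," alone keeps the space that followed the comma
parts = df["Student_details"].str.split(",", expand=True)
print(repr(parts[1][0]))

# Strip leading/trailing whitespace from every resulting column
df[["First Name", "Last Name"]] = parts.apply(lambda col: col.str.strip())
print(df)
```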

6. Use apply() Function to Split Column into Two Columns in Pandas

In Pandas, the apply() function calls a given function on each value of a column, which gives us another way to split one column into multiple columns. Here, we call apply() on the DataFrame column we want to split and pass it a lambda function that runs Python's str.split() on each value and wraps the resulting parts in a pd.Series, so the result expands into one column per part.


# Split single column into two columns using apply()
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split(",")))
print("After splitting a column into two columns:\n", df)

Yields below output.


# Output:
After splitting a column into two columns:
  Student_details  Courses    Fee First Name Last Name
0    Pramodh, Roy    Spark  25000    Pramodh       Roy
1    Leena, Singh  PySpark  20000      Leena     Singh
2  James, William   Pandas  22000      James   William
3    Addem, Smith   Hadoop  25000      Addem     Smith

6.1 Using Underscore(_)

In this example, the values of the column are separated by the '_' (underscore) delimiter. We pass '_' to the split() method inside the lambda function given to apply().


# Replace the column values with
# '_'-delimited strings
df['Student_details'] = ["Pramodh_Roy", "Leena_Singh", "James_William", "Addem_Smith"]

# Split single column into two columns using apply()
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split("_")))
print("After splitting a column into two columns:\n", df)

Yields below output.


# Output:
After splitting a column into two columns:
  Student_details  Courses    Fee First Name Last Name
0     Pramodh_Roy    Spark  25000    Pramodh       Roy
1     Leena_Singh  PySpark  20000      Leena     Singh
2   James_William   Pandas  22000      James   William
3     Addem_Smith   Hadoop  25000      Addem     Smith
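
As a sanity check, the apply() approach and Series.str.split() with expand=True should yield the same parts for well-formed data. The sketch below compares them on a small made-up DataFrame; via_str and via_apply are illustrative names. For simple delimiters, the str.split() form is usually the more direct choice, while apply() leaves room for custom logic.

```python
import pandas as pd

df = pd.DataFrame({"Student_details": ["Pramodh_Roy", "Leena_Singh"]})

# Same split, two ways
via_str = df["Student_details"].str.split("_", expand=True)
via_apply = df["Student_details"].apply(lambda x: pd.Series(str(x).split("_")))

# Compare the underlying values row by row
same = via_str.values.tolist() == via_apply.values.tolist()
print(same)
```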

Frequently Asked Questions on Split Column

How do I split a column into two columns in Pandas?

You can use the str.split() method in Pandas to split a column into two or more columns based on a delimiter or separator.

What if I want to split based on a specific character or string?

You can specify the delimiter or separator as an argument in the str.split() method. For example, if you want to split based on a comma, you can use df['column'].str.split(',').

How can I control the number of splits?

You can control the number of splits using the n parameter in str.split(). For instance, df['column'].str.split(',', n=1) will split the column at the first occurrence of the delimiter.
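
For instance, a minimal sketch with made-up data and hypothetical column names ('head', 'rest'): with n=1, each value yields at most two parts, which fits an assignment to two new columns even when the value contains several delimiters.

```python
import pandas as pd

df = pd.DataFrame({"column": ["a,b,c", "d,e,f"]})

# n=1 stops after the first comma; the remainder stays together
df[["head", "rest"]] = df["column"].str.split(",", n=1, expand=True)
print(df)
```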

How can I split a column and assign the results to new columns in the same DataFrame?

You can create new columns in the DataFrame to assign the split values. For example, you can use df[['new_column1', 'new_column2']] = df['column'].str.split(',', expand=True).

How can I split a column into two columns based on a custom function or logic?

You can use a custom function with the apply() method to split a column based on your specific logic.
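
As one possible illustration, the sketch below uses a hypothetical helper, split_name(), that splits on the last underscore only, so a multi-part first name stays together. The helper and the sample data are assumptions made for this example.

```python
import pandas as pd

def split_name(value):
    """Hypothetical helper: split on the last underscore only."""
    first, _, last = str(value).rpartition("_")
    return pd.Series([first, last])

df = pd.DataFrame({"Student_details": ["Pramodh_Roy", "Anna_Maria_Smith"]})

# apply() calls split_name() on each value and expands the returned Series
df[["First Name", "Last Name"]] = df["Student_details"].apply(split_name)
print(df)
```

Note that rpartition() returns an empty first part when the value has no underscore, so real data may need an extra check.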

7. Conclusion

In this article, I have explained the Series.str.split() function, its syntax and parameters, and how to use it to split a Pandas DataFrame string column into multiple columns. I have also used the apply() function in some examples to split one string column into two columns.

Vijetha

Vijetha is an experienced technical writer with a strong command of various programming languages. She has had the opportunity to work extensively with a diverse range of technologies, including Python, Pandas, NumPy, and R. Throughout her career, Vijetha has consistently exhibited a remarkable ability to comprehend intricate technical details and adeptly translate them into accessible and understandable materials. Follow me at Linkedin.
