The Pandas Series.str.split() function splits a single string column into two or more columns based on a specified separator or delimiter. It works like Python's built-in str.split() method, but str.split() operates on one string at a time, whereas Series.str.split() is applied to every value in a Series, such as a DataFrame column.
In this article, I will explain the syntax and parameters of Series.str.split() and show, with examples, how to split a column into multiple columns in Pandas.
Related: Split Pandas DataFrame by Column Value.
1. Quick Examples of Split Column into Two Columns
Following are quick examples of splitting a string column into two columns.
# Below are the quick examples.
# Example 1: Split string column into two new columns use '_' delimiter
df[['First Name', 'Last Name']] = df.Student_details.str.split("_", expand = True)
# Example 2: Split single column into two columns use ',' delimiter
df[['First Name', 'Last Name']] = df.Student_details.str.split(",", expand = True)
# Example 3: Split single column into two columns use apply() and ',' delimiter
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split(",")))
# Example 4: Split single column into two columns use apply() and '_' delimiter
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split("_")))
2. Syntax of Series.str.split()
Following is the syntax of Series.str.split().
# Syntax of Series.str.split()
Series.str.split(pat=None, n=-1, expand=False)
2.1 Parameters of Series.str.split()
pat: The delimiter string on which each value is split. By default, it splits on whitespace.
n: (int type) The maximum number of splits. The default is -1, which means all splits are made.
expand: (bool type) The default is False, which returns a Series of lists. If it is set to True, the split parts are expanded into separate columns and a DataFrame is returned.
2.2 Return Value
It returns a Series of lists by default, or a DataFrame when expand=True.
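To make the effect of expand on the return type concrete, here is a minimal sketch. The Series and its values are made up for illustration and are not part of the article's DataFrame.

```python
import pandas as pd

# Illustrative Series; the values are assumptions for this sketch.
names = pd.Series(["Pramodh_Roy", "Leena_Singh"])

# expand=False (the default): each element becomes a list of parts.
as_lists = names.str.split("_")
print(type(as_lists).__name__)   # Series
print(as_lists[0])               # ['Pramodh', 'Roy']

# expand=True: the parts are spread across the columns of a DataFrame.
as_frame = names.str.split("_", expand=True)
print(type(as_frame).__name__)   # DataFrame
print(as_frame.shape)            # (2, 2)
```

The DataFrame form is the one that can be assigned directly to several new columns at once.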
3. Usage of Series.str.split()
Pandas provides the Series.str.split() function to split a string column into two or more columns on a specified delimiter. Delimited string values are multiple values stored in a single column and separated by dashes, whitespace, commas, etc. This function returns a Pandas Series or DataFrame.
Let's create a Pandas DataFrame using data from a Python dictionary. The DataFrame has a string column named 'Student_details', which I would like to split into two string columns named 'First Name' and 'Last Name'.
import pandas as pd
import numpy as np
technologies = {
    'Student_details': ["Pramodh_Roy", "Leena_Singh", "James_William", "Addem_Smith"],
    'Courses': ["Spark", "PySpark", "Pandas", "Hadoop"],
    'Fee': [25000, 20000, 22000, 25000]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
This creates a DataFrame with three columns: Student_details, Courses, and Fee.
4. Split String Column into Two Columns in Pandas
Apply Pandas Series.str.split() to a DataFrame column that holds delimited string values to split it into multiple columns. Here, the values in the Student_details column are separated by the '_' (underscore) delimiter, so we pass '_' as the first argument to Series.str.split() and set expand=True to get the parts back as separate columns.
Let's apply the function and split the column into two columns:
# Split string column into two new columns
df[['First Name', 'Last Name']] = df.Student_details.str.split("_", expand = True)
print("After splitting a column into two columns:\n", df)
The DataFrame now has two new columns, First Name and Last Name, holding the text before and after the underscore.
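One case worth checking before relying on this pattern is a value that does not contain the delimiter. The sketch below uses hypothetical data (the last value has no underscore) to show what str.split() with expand=True returns in that situation.

```python
import pandas as pd

# Hypothetical data: the last value has no underscore.
df = pd.DataFrame({"Student_details": ["Pramodh_Roy", "Leena_Singh", "Addem"]})

parts = df["Student_details"].str.split("_", expand=True)
print(parts)

# The row without a delimiter gets a missing value in the second column.
print(parts[1].isna().tolist())  # [False, False, True]

# The assignment still works because the split produced exactly two columns.
df[["First Name", "Last Name"]] = parts
```

If no row contained the delimiter at all, the split would produce a single column, and assigning it to two column names would fail, so it can help to inspect the shape of the result first.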
5. Use ‘,’ Delimiter & Split Column
In this example, the string values in the column we want to split are separated by the ',' (comma) delimiter.
# Replace the Student_details column with values
# that contain the ',' delimiter
df['Student_details'] = ["Pramodh, Roy", "Leena, Singh", "James, William", "Addem, Smith"]
# Split single column into two columns use ',' delimiter
# Note: the space after each comma stays at the start of 'Last Name'
df[['First Name', 'Last Name']] = df.Student_details.str.split(",", expand = True)
print("After splitting a column into two columns:\n", df)
Yields below output.
# Output:
After splitting a column into two columns:
  Student_details  Courses    Fee First Name Last Name
0    Pramodh, Roy    Spark  25000    Pramodh       Roy
1    Leena, Singh  PySpark  20000      Leena     Singh
2  James, William   Pandas  22000      James   William
3    Addem, Smith   Hadoop  25000      Addem     Smith
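Because the sample values have a space after each comma, splitting on "," alone keeps that space at the start of the second part, which the printed table does not make visible. The following sketch, using a shortened copy of the data, shows two possible ways to handle it; which one fits depends on how consistent the spacing in your data is.

```python
import pandas as pd

df = pd.DataFrame({"Student_details": ["Pramodh, Roy", "Leena, Singh"]})

# Splitting on "," alone leaves the space at the start of the second part.
raw = df["Student_details"].str.split(",", expand=True)
print(repr(raw.loc[0, 1]))  # ' Roy'

# Option 1: include the space in the delimiter.
clean = df["Student_details"].str.split(", ", expand=True)

# Option 2: split on "," and strip whitespace from each resulting column.
stripped = raw.apply(lambda col: col.str.strip())

df[["First Name", "Last Name"]] = clean
```

Option 2 is the more forgiving choice when some values have a space after the comma and others do not.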
6. Use apply() Function to Split Column into Two Columns in Pandas
In Pandas, the apply() function runs a function on each value of a column, which we can use to split one column into multiple columns. Here, we call apply() on the DataFrame column we want to split and pass it a lambda function that splits each value with Python's str.split() and wraps the parts in a pd.Series, so that each part becomes its own column.
# Split single column into two columns use apply()
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split(",")))
print("After splitting a column into two columns:\n", df)
Yields below output.
# Output:
After splitting a column into two columns:
  Student_details  Courses    Fee First Name Last Name
0    Pramodh, Roy    Spark  25000    Pramodh       Roy
1    Leena, Singh  PySpark  20000      Leena     Singh
2  James, William   Pandas  22000      James   William
3    Addem, Smith   Hadoop  25000      Addem     Smith
6.1 Using Underscore (_)
In this example, the values in one column of the DataFrame are separated by the '_' (underscore) delimiter. We pass '_' to the split() method inside the lambda function that is passed to apply().
# Replace the Student_details column with values
# that contain the '_' delimiter
df['Student_details'] = ["Pramodh_Roy", "Leena_Singh", "James_William", "Addem_Smith"]
# Split single column into two columns use apply()
df[['First Name', 'Last Name']] = df["Student_details"].apply(lambda x: pd.Series(str(x).split("_")))
print("After splitting a column into two columns:\n", df)
Yields below output.
# Output:
After splitting a column into two columns:
  Student_details  Courses    Fee First Name Last Name
0     Pramodh_Roy    Spark  25000    Pramodh       Roy
1     Leena_Singh  PySpark  20000      Leena     Singh
2   James_William   Pandas  22000      James   William
3     Addem_Smith   Hadoop  25000      Addem     Smith
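For this data, the apply() approach and Series.str.split() with expand=True produce the same values. The sketch below checks that side by side, using only the Student_details column from the article.

```python
import pandas as pd

df = pd.DataFrame({
    "Student_details": ["Pramodh_Roy", "Leena_Singh", "James_William", "Addem_Smith"]
})

# Vectorized string method.
via_str = df["Student_details"].str.split("_", expand=True)

# Row-by-row version: one pd.Series is built per value.
via_apply = df["Student_details"].apply(lambda x: pd.Series(str(x).split("_")))

print(via_str.values.tolist() == via_apply.values.tolist())  # True
```

Since apply() calls a Python function and builds a Series for every row, it is typically slower than str.split() on large columns, so it is mostly worth reaching for when the splitting logic cannot be expressed with a single delimiter.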
Frequently Asked Questions on Split Column
You can use the str.split() method in Pandas to split a column into two or more columns based on a delimiter or separator.
You can specify the delimiter or separator as an argument in the str.split() method. For example, if you want to split based on a comma, you can use df['column'].str.split(',').
You can control the number of splits using the n parameter in str.split(). For instance, df['column'].str.split(',', n=1) will split each value only at the first occurrence of the delimiter.
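As a small illustration of the n parameter, the sketch below uses a made-up Series in which one value contains the delimiter twice.

```python
import pandas as pd

# Illustrative values; the first one contains two underscores.
names = pd.Series(["James_Earl_Jones", "Leena_Singh"])

# n=1 stops after the first delimiter, so the rest stays together.
limited = names.str.split("_", n=1, expand=True)
print(limited.values.tolist())
# [['James', 'Earl_Jones'], ['Leena', 'Singh']]

# The default n=-1 splits at every delimiter; shorter rows are
# padded with missing values so that all rows have the same width.
full = names.str.split("_", expand=True)
print(full.shape)  # (2, 3)
```

Limiting the splits with n=1 is a simple way to guarantee at most two columns when assigning to two new column names.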
You can create new columns in the DataFrame to assign the split values. For example, you can use df[['new_column1', 'new_column2']] = df['column'].str.split(',', expand=True).
You can use a custom function with the apply() method to split a column based on your specific logic.
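As one possible shape for such a custom function, the sketch below defines a hypothetical helper named split_name (not part of Pandas or of the examples above) that accepts either '_' or ',' as the delimiter and trims surrounding spaces. The mixed-delimiter data is also an assumption made for illustration.

```python
import pandas as pd

def split_name(value):
    """Hypothetical helper: split on the first '_' or ',' and trim spaces."""
    text = str(value)
    for sep in ("_", ","):
        if sep in text:
            first, last = text.split(sep, 1)
            return pd.Series([first.strip(), last.strip()])
    # No known delimiter: keep the whole value as the first part.
    return pd.Series([text.strip(), None])

# Assumed sample data mixing both delimiters and one value with neither.
df = pd.DataFrame({"Student_details": ["Pramodh_Roy", "Leena, Singh", "Addem"]})
df[["First Name", "Last Name"]] = df["Student_details"].apply(split_name)
print(df)
```

Returning a pd.Series of fixed length from every branch keeps the result at exactly two columns, which is what the two-column assignment expects.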
7. Conclusion
In this article, I have explained the Series.str.split() function, its syntax and parameters, and how to use it to split a Pandas DataFrame string column into multiple columns. I have also used the apply() function in some examples to split one string column into two columns.
Thank you!