We can split a Pandas DataFrame into smaller DataFrames based on the values of one or more columns. Pandas provides several features for splitting a DataFrame into smaller ones, by column values as well as by row or column index.
In this article, I will explain how to split a Pandas DataFrame based on a column value condition, and how to use the df.groupby() function to split it by the values of a single column or multiple columns.
1. Quick Examples of Split DataFrame by Column Value
Following are quick examples of splitting a Pandas DataFrame by column value.
# Below are the quick examples.
# Example 1: Split DataFrame based on column value condition
df1 = df[df['Fee'] <= 25000]
# Example 2: Split DataFrame based on Duration == 35days
df1 = df[df['Duration'] == '35days']
# Example 3: Split Dataframe using groupby() &
# Grouping by particular dataframe column
grouped = df.groupby(df.Duration)
df1 = grouped.get_group("35days")
# Example 4: Split Dataframe using groupby() &
# Grouping by multiple columns
grouped = df.groupby(['Discount', 'Fee'])
df1 = grouped.get_group((1000, 23000))
Let’s create a Pandas DataFrame using data from a Python dictionary, where the columns are 'Courses', 'Fee', 'Discount', and 'Duration'.
import pandas as pd
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Discount': [1000, 2300, 1000, 1200, 2500],
    'Duration': ['35days', '35days', '40days', '30days', '25days']
}
df = pd.DataFrame(technologies)
print(df)
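Yields below output.
# Output:
Courses Fee Discount Duration
0 Spark 22000 1000 35days
1 PySpark 25000 2300 35days
2 Hadoop 23000 1000 40days
3 Python 24000 1200 30days
4 Pandas 26000 2500 25days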
2. Split DataFrame based on Column Value condition
We can create a smaller DataFrame from a given DataFrame by filtering on a column value with a boolean condition. The syntax below keeps only the rows where the condition on the specified column is true.
# Split DataFrame based on column value condition
df1 = df[df['Fee'] <= 25000]
print(df1)
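Yields below output.
# Output:
Courses Fee Discount Duration
0 Spark 22000 1000 35days
1 PySpark 25000 2300 35days
2 Hadoop 23000 1000 40days
3 Python 24000 1200 30days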
In another example, I will apply the condition 'Duration' == '35days' to the given DataFrame. It returns a smaller DataFrame containing only the rows that satisfy the condition.
# Split DataFrame based on Duration == 35days
df1 = df[df['Duration'] == '35days']
print(df1)
Yields below output.
# Output:
Courses Fee Discount Duration
0 Spark 22000 1000 35days
1 PySpark 25000 2300 35days
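A condition can also split the DataFrame into two complementary parts rather than just one. The following is a minimal sketch of that idea, assuming the same sample data as above; the names mask, df_match, and df_rest are illustrative and not part of the original examples.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Discount': [1000, 2300, 1000, 1200, 2500],
    'Duration': ['35days', '35days', '40days', '30days', '25days'],
})

# Build the boolean mask once and reuse it for both parts
# (mask, df_match, df_rest are hypothetical names for illustration)
mask = df['Duration'] == '35days'
df_match = df[mask]   # rows where the condition is True
df_rest = df[~mask]   # all remaining rows

print(df_match)
print(df_rest)
```

Because both parts come from the same mask, every row of the original DataFrame ends up in exactly one of them.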
3. Split DataFrame by unique Column Value
The DataFrame.groupby() function splits a DataFrame into groups based on column values. First, group the DataFrame on the column using groupby(); then select a specific group using the get_group() function. This approach is well suited to splitting a DataFrame on each distinct value of a specified column.
# Split Dataframe using groupby() &
# Grouping by particular dataframe column
grouped = df.groupby('Duration')
df1 = grouped.get_group("35days")
print(df1)
Yields below output.
# Output:
Courses Fee Discount Duration
0 Spark 22000 1000 35days
1 PySpark 25000 2300 35days
The above syntax returns a new DataFrame consisting of the grouped rows where 'Duration' is '35days'.
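To get a smaller DataFrame for every distinct value at once, rather than one group at a time, we can iterate over the grouped object. This is a sketch under the assumption that the sample data above is used; the dictionary name frames is hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Discount': [1000, 2300, 1000, 1200, 2500],
    'Duration': ['35days', '35days', '40days', '30days', '25days'],
})

# Iterating over a groupby object yields (key, sub-DataFrame) pairs.
# 'frames' is an illustrative name: one DataFrame per Duration value.
frames = {key: group for key, group in df.groupby('Duration')}

for key, group in frames.items():
    print(key)
    print(group)
```

Each value in the dictionary is an ordinary DataFrame, so frames['35days'] holds the same rows that get_group("35days") returns.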
4. Pandas Split by Multiple Column Values
We can also use the groupby() function to split a DataFrame on more than one column. To do so, pass a list of the columns to group by to groupby(), then select a specific group by passing a tuple of values, in the same column order, to get_group(). For example:
# Split Dataframe apply groupby() on multiple columns
grouped = df.groupby(['Discount', 'Fee'])
df1 = grouped.get_group((1000, 23000))
print(df1)
Yields below output.
# Output:
Courses Fee Discount Duration
2 Hadoop 23000 1000 40days
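Note that get_group() raises a KeyError when the requested combination of values does not exist. One possible way to handle this, shown here as a sketch assuming the same sample data, is to collect the groups into a dictionary keyed by tuples and look them up with dict.get(); the names frames, hadoop, and missing are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Fee': [22000, 25000, 23000, 24000, 26000],
    'Discount': [1000, 2300, 1000, 1200, 2500],
    'Duration': ['35days', '35days', '40days', '30days', '25days'],
})

# With multiple grouping columns, each key is a tuple (Discount, Fee).
# 'frames' is a hypothetical name for the resulting lookup table.
frames = {key: group for key, group in df.groupby(['Discount', 'Fee'])}

hadoop = frames.get((1000, 23000))   # existing combination
missing = frames.get((9999, 1))      # absent combination gives None

print(hadoop)
print(missing)
```

The tuple order must match the order of the columns passed to groupby(), here Discount first and Fee second.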
5. Conclusion
In this article, I have explained how to split a Pandas DataFrame based on a column value condition, and how to use the df.groupby() function to split it by the values of a single column or multiple columns.
Happy learning!!