The `groupby()` function of a Pandas Series is a powerful tool for grouping data based on specified criteria. A groupby operation splits the data into groups and then applies a function to each group independently. It is most often used with DataFrames, but it works on a Series in much the same way.

You can group a Pandas Series with `groupby()` and then run various operations on the grouped data, including `sum()`, `mean()`, `count()`, `min()`, and `max()`. In this article, I will explain the Pandas Series `groupby()` function, covering its syntax, parameters, and usage, and show how to group the data in a Series with multiple examples.

**Key Points –**

- Pandas Series `groupby()` is used for grouping data based on a specified criterion, allowing you to analyze and manipulate subsets of the data independently.
- The `groupby()` operation follows the split-apply-combine paradigm. It splits the data into groups, applies a function to each group, and then combines the results into a new data structure.
- The primary use of `groupby()` is for aggregation, where you can calculate summary statistics (e.g., sum, mean, count) for each group. Additionally, it supports transformations, allowing you to modify the data within each group.
- `groupby()` is valuable for analyzing categorical data, enabling insights into patterns and trends within different categories or levels of a variable.
- The groups created by `groupby()` become the index of the aggregated result. The `as_index` parameter controls this behavior for DataFrames; for a Series it has to stay at its default of `True`.
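To make the split-apply-combine idea concrete, here is a minimal sketch that performs the three steps by hand and compares the result with the built-in aggregation. The sample data and the variable names (`pieces`, `sums`, `manual`) are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data: fees indexed by course name
ser = pd.Series([22000, 25000, 23000, 26000],
                index=["Spark", "Python", "Spark", "Python"])

# Split: iterating over a groupby object yields (key, sub-Series) pairs
pieces = {key: group for key, group in ser.groupby(level=0)}

# Apply: reduce each piece to a single number
sums = {key: int(group.sum()) for key, group in pieces.items()}

# Combine: assemble the per-group results into a new Series
manual = pd.Series(sums)

# The built-in aggregation performs the same three steps in one call
builtin = ser.groupby(level=0).sum()
print(manual.to_dict() == builtin.to_dict())
```

In practice you would call the aggregation directly; the manual version is only there to show what happens at each step.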

## Syntax of Series groupby()

Following is the syntax of the Series `groupby()` function.

```
# Syntax of series groupby()
Series.groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=<object object>, observed=False, **kwargs)
```

### Parameters of Series groupby()

Following are the parameters of the Series groupby() function.

- `by` – This parameter specifies the grouping information. It can be a function, a dictionary or Series that maps index labels to groups, an array of keys with the same length as the Series, or a label or list of labels. A function is called on each index label, not on the values.
- `axis` – The axis along which to group. The default is 0 (rows), which is the only axis a Series has. This parameter is deprecated as of pandas 2.1.
- `level` – For a MultiIndex, the level (name or position) to use for grouping.
- `as_index` – If True, the group labels become the index of the result. `as_index=False` is only valid for DataFrames, so leave it at the default when grouping a Series.
- `sort` – Sort the group keys. The default is True.
- `group_keys` – When calling `apply()`, add group keys to the index to identify pieces.
- `squeeze` – Reduce the dimensionality of the return type if possible. This parameter was deprecated in pandas 1.1 and removed in pandas 2.0, so omit it in current code.
- `observed` – This is only relevant when grouping by categorical data and determines whether to use all categories or only observed categories.
- `**kwargs` – Additional keyword arguments are passed to the groupby function.
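As a small illustration of the `sort` parameter, the sketch below groups the same Series twice, once with the default and once with `sort=False`. The sample data is an assumption made for this example.

```python
import pandas as pd

# Hypothetical sample data: fees indexed by course name
ser = pd.Series([22000, 25000, 23000, 24000],
                index=["Spark", "Python", "Spark", "Pandas"])

# Default sort=True: group keys come back in sorted order
sorted_keys = ser.groupby(level=0).sum()

# sort=False: group keys keep the order in which they first appear
unsorted_keys = ser.groupby(level=0, sort=False).sum()

print(list(sorted_keys.index))    # ['Pandas', 'Python', 'Spark']
print(list(unsorted_keys.index))  # ['Spark', 'Python', 'Pandas']
```

Only the order of the result changes; the aggregated values for each key are the same in both cases.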

### Return Value

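It returns a `GroupBy` object (for a Series, specifically a `SeriesGroupBy`). This object is an intermediate data structure that represents a mapping of keys to corresponding groups. No aggregation has happened yet at this point; the actual computation or transformation is performed when you call a method such as `sum()` on the result of `groupby()`.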
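A quick way to see this intermediate object is to inspect it before aggregating. The following sketch uses a small made-up Series and the `ngroups` attribute and `get_group()` method of the groupby object.

```python
import pandas as pd

# Hypothetical sample data: fees indexed by course name
ser = pd.Series([22000, 25000, 23000], index=["Spark", "Python", "Spark"])

# groupby() alone only describes the grouping
grouped = ser.groupby(level=0)

print(type(grouped).__name__)               # SeriesGroupBy
print(grouped.ngroups)                      # 2
print(grouped.get_group("Spark").tolist())  # [22000, 23000]
```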

Let’s create a Pandas Series with a custom index. A Series has no columns, so the course names go into the index and the fees become the values.

```
# Import pandas
import pandas as pd
# Create a sample Series
data = {'Courses': ["Spark","Python","Spark","Pandas","Python","Pandas"],
'Fee': [22000,25000,23000,24000,26000,30000]}
ser = pd.Series(data['Fee'], index=data['Courses'])
print("Pandas Series:\n", ser)
```

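This creates a Series of six fees indexed by course name, in which each of the labels `Spark`, `Python`, and `Pandas` appears twice.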

## Group by Pandas Series Unique Values and Calculate the Sum

If you want to group a Pandas Series by the unique labels in its index and calculate the sum for each group, you can use the `groupby()` function along with an aggregation function like `sum()`.

```
# Import pandas
import pandas as pd
# Group by the index labels and calculate the sum
grouped_series = ser.groupby(ser.index).sum()
print("Get the sum of grouped data:\n", grouped_series)
# Equivalent: group by index level 0 (the course names)
grouped_series = ser.groupby(level=0).sum()
print("Get the sum of grouped data:\n", grouped_series)
```

In the above example, `groupby(ser.index)` groups the Series by the unique labels in its index, which holds the course names. Then, the `sum()` function is applied to calculate the sum of fees for each unique course. The result is a new Series whose index holds the unique course names and whose values are the corresponding sums of fees: 54000 for Pandas, 51000 for Python, and 45000 for Spark.

## Group by Custom Categories and Calculate the Max

Alternatively, you can group by custom categories and calculate the maximum value for each group. Define the categories in a dictionary that maps index labels to category names and pass it to the `groupby()` function, which returns a groupby object. Then call the `max()` function on that object to get the maximum value of each group.

```
# Define custom categories
custom_categories = {'Spark': 'Programming', 'Python': 'Programming', 'Pandas': 'Data Analysis'}
# Group by custom categories and calculate the max
grouped_series = ser.groupby(custom_categories).max()
print("Get the maximum value of grouped data:\n", grouped_series)
# Output:
# Get the maximum value of grouped data:
# Data Analysis 30000
# Programming 26000
# dtype: int64
```

In the above example, `custom_categories` is a dictionary that maps each course in the index to a custom category. `groupby(custom_categories)` groups the given Series based on these custom categories, and then `max()` is applied to calculate the maximum fee for each category.

## Group by String Length and Count Occurrences

Similarly, to group by the length of strings in a Pandas Series and count the occurrences of each string length, you can use the `groupby()` function along with the `str.len()` method and the `count()` aggregation.

```
# Imports pandas
import pandas as pd
# Create a sample Series
data = {'Courses': ["Spark", "Python", "Java", "Pandas", "C", "R"]}
ser = pd.Series(data['Courses'])
print("Pandas Series:\n", ser)
# Group by string length and count occurrences
grouped_series = ser.groupby(ser.str.len()).count()
print("Group by string length and count occurrences:\n", grouped_series)
# Output:
# Pandas Series:
# 0 Spark
# 1 Python
# 2 Java
# 3 Pandas
# 4 C
# 5 R
# dtype: object
# Group by string length and count occurrences:
# 1 2
# 4 1
# 5 1
# 6 2
# dtype: int64
```

In the above example, `ser.str.len()` is used to get the length of each string in the Series. `groupby(ser.str.len())` groups the Series based on these string lengths, and then `count()` is applied to count the occurrences of each string length.

## Group by Even or Odd Values and Calculate the Mean

To group a Pandas Series by whether its values are even or odd, and then calculate the mean for each group, you can pass the expression `ser % 2` as the grouping key to the `groupby()` function and then apply the `mean()` aggregation.

```
# Imports pandas
import pandas as pd
# Create a sample Series
ser = pd.Series([1, 2, 3, 4, 5, 6])
# Group by even or odd values and calculate the mean
grouped_series = ser.groupby(ser % 2).mean()
print("Group by even or odd values and calculate the mean:\n", grouped_series)
# Output:
# Group by even or odd values and calculate the mean:
# 0 4.0
# 1 3.0
# dtype: float64
```

Here,

- The `ser % 2` expression creates groups based on whether each value in the Series is even (group 0) or odd (group 1).
- The `groupby()` function is used to group the Series based on these groups.
- Finally, the `mean()` function is applied to calculate the mean for each group.

## Group by Custom Function and Calculate the Mean

Similarly, you can group a Pandas Series with a custom function and then calculate the mean for each group by passing the function to `groupby()` and applying the `mean()` aggregation. Note that pandas calls a function passed to `groupby()` on each index label, not on the values. For example,

```
# Imports pandas
import pandas as pd
# Create a sample Series
ser = pd.Series([10, 20, 30, 40, 50])
# Group by a custom function and calculate the mean
# The function is called on each index label (0 to 4), not on the values
grouped_series = ser.groupby(lambda x: 'even' if x % 2 == 0 else 'odd').mean()
print(grouped_series)
# Output:
# even 30.0
# odd 30.0
# dtype: float64
```

Here,

- The lambda function `lambda x: 'even' if x % 2 == 0 else 'odd'` is used as a custom function. pandas calls it on each index label, so it categorizes each row as ‘even’ or ‘odd’ based on whether its index label (0 through 4) is divisible by 2, not based on the value itself.
- The `groupby()` function is applied to group the Series based on the result of the custom function. In this case, it creates two groups: one for the even labels 0, 2, and 4 (values 10, 30, 50) and one for the odd labels 1 and 3 (values 20, 40).
- Finally, the `mean()` function is used to calculate the mean for each group, which is 30.0 in both cases.
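Because a function passed to `groupby()` receives index labels, grouping by the values themselves needs one extra step. The sketch below contrasts the two approaches; the helper name `parity` is made up for illustration, and `Series.map()` is used to apply it to the values first.

```python
import pandas as pd

ser = pd.Series([10, 20, 30, 40, 50])

def parity(x):
    """Label a number as 'even' or 'odd' (hypothetical helper)."""
    return "even" if x % 2 == 0 else "odd"

# Passing the function directly: it is applied to the index labels 0..4
by_label = ser.groupby(parity).sum()

# Mapping the function over the values first: groups follow the values
by_value = ser.groupby(ser.map(parity)).sum()

print(by_label.to_dict())  # {'even': 90, 'odd': 60}
print(by_value.to_dict())  # {'even': 150}
```

All five values are even, so grouping by value produces a single group, while grouping by label still produces two.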

## Group by Boolean Condition and Calculate the Sum

If you want to group a Pandas Series by a boolean condition and calculate the sum for each group, you can use a boolean condition directly within the `groupby()` function and then apply the `sum()` function.

```
# Imports pandas
import pandas as pd
# Create a sample Series
ser = pd.Series([10, 20, 30, 40, 50])
# Group by boolean condition and calculate the sum
result = ser.groupby(ser > 30).sum()
print(result)
# Output:
# False 60
# True 90
# dtype: int64
```

Here,

- The condition `ser > 30` creates a boolean Series where `True` represents values greater than 30, and `False` represents values less than or equal to 30.
- The `groupby()` function is applied to group the Series based on this boolean condition, creating two groups: one for values greater than 30 (`True`) and one for values less than or equal to 30 (`False`).
- Finally, the `sum()` function is used to calculate the sum for each group.
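If `True` and `False` are not descriptive enough as group labels, one option is to map the boolean condition to readable names before grouping. This is a minimal sketch; the label strings are arbitrary choices for this example.

```python
import pandas as pd

ser = pd.Series([10, 20, 30, 40, 50])

# Turn the boolean condition into descriptive group labels
labels = (ser > 30).map({True: "above 30", False: "30 or below"})

result = ser.groupby(labels).sum()
print(result.to_dict())  # {'30 or below': 60, 'above 30': 90}
```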

## Frequently Asked Questions on Pandas Series groupby() Function

**What does the groupby() function do in Pandas Series?**

The `groupby()` function in Pandas Series is used to group data based on specified criteria. It involves splitting the data into groups, applying a function to each group, and then combining the results.

**How do I use the groupby() function with a custom grouping function?**

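To use the `groupby()` function with a custom grouping function in a Pandas Series, pass the function itself as the argument to `groupby()`; pandas calls it on each index label. To group by the values instead, apply the function to the values first, for example with `ser.map(func)`, and pass the resulting Series to the `groupby()` method.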

**What are some common aggregation functions used with groupby()?**

Common aggregation functions used with `groupby()` include `sum()`, `mean()`, `count()`, `min()`, `max()`, and `agg()` for custom aggregations.
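As a sketch of a custom aggregation, `agg()` accepts a list that mixes built-in aggregation names with your own function and returns one column per entry. The helper `fee_range` and the sample data are assumptions made for this example.

```python
import pandas as pd

# Sample data: fees indexed by course name
ser = pd.Series([22000, 25000, 23000, 24000, 26000, 30000],
                index=["Spark", "Python", "Spark", "Pandas", "Python", "Pandas"])

def fee_range(group):
    """Difference between the highest and lowest fee (hypothetical helper)."""
    return group.max() - group.min()

# Several aggregations at once give a DataFrame with one column each
stats = ser.groupby(level=0).agg(["sum", "mean", fee_range])
print(stats)
```

The column for the custom function is named after the function, so here the columns are `sum`, `mean`, and `fee_range`.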

**How do I group by the index of a Series?**

To group by the index of a Pandas Series, you can use the `groupby()` function and specify the `level` parameter with the index level you want to use for grouping, for example `ser.groupby(level=0)`.
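The `level` parameter is most useful with a MultiIndex, where it selects which level to group by. The following sketch assumes a made-up two-level index with the level names `course` and `mode`.

```python
import pandas as pd

# Hypothetical two-level index: course name and delivery mode
idx = pd.MultiIndex.from_tuples(
    [("Spark", "online"), ("Spark", "classroom"), ("Python", "online")],
    names=["course", "mode"],
)
ser = pd.Series([22000, 23000, 25000], index=idx)

# Group by one level at a time, referring to the level by name
by_course = ser.groupby(level="course").sum()
by_mode = ser.groupby(level="mode").sum()

print(by_course.to_dict())  # {'Python': 25000, 'Spark': 45000}
print(by_mode.to_dict())    # {'classroom': 23000, 'online': 47000}
```

Levels can also be given by position, so `level=0` is equivalent to `level="course"` here.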

**Can I group by a categorical column?**

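Pandas supports grouping by categorical data. When you use `groupby()` with a categorical key, it respects the order of the categories and groups the data accordingly. The `observed` parameter controls whether categories that never occur in the data still appear in the result.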
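The effect of `observed` can be seen in a short sketch. The category names and data here are made up, and `observed` is passed explicitly in both calls so the result does not depend on the default of the installed pandas version.

```python
import pandas as pd

# Hypothetical ordered categories; 'medium' never occurs in the data
size_type = pd.CategoricalDtype(["small", "medium", "large"], ordered=True)
keys = pd.Series(["small", "large", "small"], dtype=size_type)
ser = pd.Series([1, 2, 3])

# observed=False keeps every category, including the unused one
all_cats = ser.groupby(keys, observed=False).sum()

# observed=True keeps only the categories present in the data
only_seen = ser.groupby(keys, observed=True).sum()

print(list(all_cats.index), all_cats.tolist())    # ['small', 'medium', 'large'] [4, 0, 2]
print(list(only_seen.index), only_seen.tolist())  # ['small', 'large'] [4, 2]
```

In both results the groups follow the category order (small, medium, large) rather than alphabetical order.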

## Conclusion

In this article, I have explained the `groupby()` function in the Pandas Series, covering its syntax, parameters, and usage, and shown how to group the data in a Series based on some criteria and then perform various operations on each group.

Happy Learning !!
