# Pandas Percentage Total With Groupby

You can calculate the percentage of total with the groupby of pandas DataFrame by using `DataFrame.groupby()`, `DataFrame.agg()`, `DataFrame.transform()` methods and `DataFrame.apply()` with `lambda` function. You can also calculate percentage by `sum` and `divide` functions.

In this article, You can find out how to calculate the percentage total of pandas DataFrame with some below examples.

## 1. Quick Examples of Pandas Percentage Total by Groupby

If, You are in hurry below are some quick examples to calculate percentage total of Pandas DataFrame.

``````
# Below are some quick examples.

# Using DataFrame.agg() Method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})

# Percentage by lambda and DataFrame.apply() method.
df3 = df2.groupby(level=0).apply(lambda x:100 * x / float(x.sum()))

# Using DataFrame.div() method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
Courses = df.groupby(['Courses']).agg({'Fee': 'sum'})
df2.div(Courses, level='Courses') * 100

# Using groupby with DataFrame.rename() Method.
df2= df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("count")

# Using DataFrame.transform() method.
df['%'] = 100 * df['Fee'] / df.groupby('Courses')['Fee'].transform('sum')

# Alternative method of DataFrame.transform() by lambda functions.
df['Courses_Fee'] = df.groupby(['Courses'])['Fee'].transform(lambda x: x/x.sum())

# Caluclate groupby with DataFrame.rename() and DataFrame.transform() with lambda functions.
df2=df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("Courses_fee").groupby(level = 0).transform(lambda x: x/x.sum())
``````

Now, Let’s create a pandas DataFrame with a few rows and columns, execute these examples and validate results that calculate the percentage total of pandas DataFrame.

``````
# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Spark","Python","PySpark"],
'Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days', None,np.nan]
}
df = pd.DataFrame(technologies)
print(df)
``````

Yields below output.

``````
# Output:
Courses    Fee Duration
0    Spark  22000   30days
1  PySpark  25000   50days
2    Spark  23000   30days
3   Python  24000   60days
4  PySpark  26000   35days
``````

## 2. Pandas Calculate percentage with Groupby With .agg() Method

You can calculate the percentage by using `DataFrame.groupby()` method. It is a process involving one or more of the following steps.

• Splitting the data into groups based on some criteria.
• Applying a function to each group independently.
• Combining the results into a data structure.

The `DataFrame.agg()` method is used to pass a function or list of functions to be applied on a series or even each element of series separately. In the list of functions, multiple results are returned by `DataFrame.agg()` method.

``````
# Using DataFrame.agg() Method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
print(df2)
``````

Yields below output.

``````
# Output:
Fee
Courses Fee
PySpark 25000  25000
26000  26000
Python  24000  24000
Spark   22000  22000
23000  23000
``````

Now, you can calculate the percentage in a simpler way just `groupby` the `Courses` and divide` Fee` column by its `sum` by `lambda` function and `DataFrame.apply()` method. Here `df2` is a Series of Multi Index with one column where values are all numeric. After you have done `groupby`, each `x` is a subset of that column and `level=0` means you are grouping by the first level of the `index`, rather than by one of the columns.

``````
# Percentage by lambda and DataFrame.apply() method.
df3 = df2.groupby(level=0).apply(lambda x:100 * x / float(x.sum()))
print(df3)
``````

Yields below output.

``````
# Output:
Courses Fee
PySpark 25000   49.019608
26000   50.980392
Python  24000  100.000000
Spark   22000   48.888889
23000   51.111111
``````

Another method to calculate total percentage with `groupby` by using `DataFrame.div()` method. Here `div` tells pandas to join the DataFrame based on the values in the `Courses` level of the `index`.

``````
# Using DataFrame.div() method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
Courses = df.groupby(['Courses']).agg({'Fee': 'sum'})
df3=df2.div(Courses, level='Courses') * 100
print(df3)
``````

Yields output same as above.

## 3. Using groupby with DataFrame.transform() Method

You can caluclate pandas percentage with total by groupby() and `DataFrame.transform()` method. The `transform()` method allows you to execute a function for each value of the DataFrame. Here, the percentage directly summarized DataFrame, then the results will be calculated using all the data.

``````
# Using DataFrame.transform() method.
df['%'] = 100 * df['Fee'] / df.groupby('Courses')['Fee'].transform('sum')
print(df)
``````

Yields below output.

``````
# Output:
Courses    Fee Duration           %
0    Spark  22000   30days   48.888889
1  PySpark  25000   50days   49.019608
2    Spark  23000   30days   51.111111
3   Python  24000   60days  100.000000
4  PySpark  26000   35days   50.980392
``````

Alternatively, you can also caluclate percentage of total by using `DataFrame.transform()` method with `lambda` functions in which you can add the percentages as a new column, leaving the rest of the DataFrame untouched.

``````
# Alternative method of DataFrame.transform() by lambda functions.
df['Courses_Fee'] = df.groupby(['Courses'])['Fee'].transform(lambda x: x/x.sum())
print(df)
``````

Yields Output same as above.

## 4. Other Example-

You can calculate pandas percentage of total by using groupby using `lambda` function.

``````
# Caluclate groupby with DataFrame.rename() and DataFrame.transform() with lambda functions.
df2=df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("Courses_fee").groupby(level = 0).transform(lambda x: x/x.sum())
print(df2)
``````

Yields below output.

``````
# Output:
Courses  Fee
PySpark  25000    0.490196
26000    0.509804
Python   24000    1.000000
Spark    22000    0.488889
23000    0.511111
Name: Courses_fee, dtype: float64
``````

## 6. Complete Examples to Caluclate Percentage with Groupby

Below are Complete examples to caluclate percentage with groupby of pandas DataFrame.

``````
# Below are complete examples.

# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Spark","Python","PySpark"],
'Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days', None,np.nan]
}
df = pd.DataFrame(technologies)
print(df)

# Using DataFrame.agg() Method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
print(df2)

# Percentage by lambda and DataFrame.apply() method.
df3 = df2.groupby(level=0).apply(lambda x:100 * x / float(x.sum()))
print(df3)

# Using DataFrame.div() method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
Courses = df.groupby(['Courses']).agg({'Fee': 'sum'})
df2.div(Courses, level='Courses') * 100
print(df2)

# Using DataFrame.transform() method.
df['%'] = 100 * df['Fee'] / df.groupby('Courses')['Fee'].transform('sum')
print(df)

# Alternative method of DataFrame.transform() by lambda functions.
df['Courses_Fee'] = df.groupby(['Courses'])['Fee'].transform(lambda x: x/x.sum())
print(df)

# Caluclate groupby with DataFrame.rename() and DataFrame.transform() with lambda functions.
df2=df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("Courses_fee").groupby(level = 0).transform(lambda x: x/x.sum())
print(df2)
``````

## Conclusion

In this article, You have learned how to calculate percentage with groupby of pandas DataFrame by using `DataFrame.groupby()`, `DataFrame.agg()`, `DataFrame.transform()` and `DataFrame.apply()` methods with `lambda` function. You can also calculate by `sum` and `divide` functions with examples.