You can calculate the percentage of the total within each group using `DataFrame.groupby()` along with `DataFrame.agg()`, `DataFrame.transform()`, or `DataFrame.apply()` with a `lambda` function. You can also calculate the percentage by combining `sum()` and `div()`.

In this article, you will learn how to calculate the percentage of the total in a pandas DataFrame, with examples.

## 1. Quick Examples of Pandas Percentage Total by Groupby

If you are in a hurry, below are some quick examples of calculating the percentage of the total in a Pandas DataFrame.

``````
# Below are some quick examples.

# Example 1: Using DataFrame.agg() Method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})

# Example 2: Percentage by lambda and DataFrame.apply() method.
# group_keys=False keeps the original (Courses, Fee) index.
df3 = df2.groupby(level=0, group_keys=False).apply(lambda x: 100 * x / x.sum())

# Example 3: Using DataFrame.div() method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
Courses = df.groupby(['Courses']).agg({'Fee': 'sum'})
df3 = df2.div(Courses, level='Courses') * 100

# Example 4: Using groupby with DataFrame.rename() Method.
df2 = df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("count")

# Example 5: Using DataFrame.transform() method.
df['%'] = 100 * df['Fee'] / df.groupby('Courses')['Fee'].transform('sum')

# Example 6: Alternative method of DataFrame.transform() by lambda functions.
df['Courses_Fee'] = df.groupby(['Courses'])['Fee'].transform(lambda x: x / x.sum())

# Example 7: Calculate groupby with DataFrame.rename() and DataFrame.transform() with lambda functions.
df2 = df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("Courses_fee").groupby(level=0).transform(lambda x: x / x.sum())
``````

Now, let's create a Pandas DataFrame with a few rows and columns, run these examples, and validate the results.

``````
# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Spark","Python","PySpark"],
'Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days', None,np.nan]
}
df = pd.DataFrame(technologies)
print("Create DataFrame:\n", df)
``````

This creates a DataFrame with five rows and three columns: `Courses`, `Fee`, and `Duration`.

## 2. Pandas Calculate Percentage with Groupby Using the agg() Method

You can calculate the percentage of the total within each group using the `DataFrame.groupby()` method along with the `agg()` function. First, call `groupby()` on the DataFrame `df` to group it by the columns `'Courses'` and `'Fee'`. Then apply `agg()` (aggregate) to sum `Fee` within each group.

Let's calculate the percentage of the total `Fee` within each `Courses` group, starting with the per-group sums.

``````
# Using DataFrame.agg() Method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
print("Sum of Fee for each Courses and Fee group:\n", df2)
``````

This returns a DataFrame indexed by `Courses` and `Fee`, with the summed `Fee` as its only column.

Next, call `groupby()` on the DataFrame `df2` to group it by the first level of the index (level 0), which is `Courses`. Then use `apply()` with a lambda function to calculate the percentage of the total for each group.

The lambda function receives each group as `x`, multiplies every value by 100, and divides by the sum of all values in that group.

``````
# Percentage by lambda and DataFrame.apply() method.
# group_keys=False keeps the original (Courses, Fee) index.
df3 = df2.groupby(level=0, group_keys=False).apply(lambda x: 100 * x / x.sum())
print(df3)
``````

Yields below output.

``````
# Output:
                      Fee
Courses Fee
PySpark 25000   49.019608
        26000   50.980392
Python  24000  100.000000
Spark   22000   48.888889
        23000   51.111111
``````
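To make the lambda's arithmetic concrete, here is a minimal sketch that applies the same formula to a single group by hand. The `pyspark_fees` Series is a hypothetical stand-in for the `PySpark` group, built from the two `Fee` values in the sample data; it is not part of the original example.

```python
import pandas as pd

# Hypothetical stand-in for the PySpark group: its two Fee values from the sample data.
pyspark_fees = pd.Series([25000, 26000], name="Fee")

# The same formula the lambda applies to each group:
# multiply every value by 100 and divide by the group total.
pyspark_pct = 100 * pyspark_fees / pyspark_fees.sum()
print(pyspark_pct)

# The percentages within one group should add up to 100
# (up to floating-point rounding).
print(pyspark_pct.sum())
```

The two values should match the `PySpark` rows in the output above.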

Another method is to calculate the percentage of the total for each group using the `DataFrame.div()` method. Here, the `level='Courses'` argument tells pandas to align the two DataFrames on the `Courses` level of the index, so each row is divided by the total of its own course.

``````
# Using DataFrame.div() method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
Courses = df.groupby(['Courses']).agg({'Fee': 'sum'})
df3=df2.div(Courses, level='Courses') * 100
print(df3)
``````

Yields the same output as above.
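The level-based alignment can be easier to follow on a plain Series. The sketch below is illustrative only: it builds the `(Courses, Fee)` index by hand, and the names `fee_sums` and `course_totals` are hypothetical, not taken from the article.

```python
import pandas as pd

# Fee sums indexed by (Courses, Fee), built by hand so the sketch is self-contained.
index = pd.MultiIndex.from_tuples(
    [("PySpark", 25000), ("PySpark", 26000), ("Python", 24000),
     ("Spark", 22000), ("Spark", 23000)],
    names=["Courses", "Fee"],
)
fee_sums = pd.Series([25000, 26000, 24000, 22000, 23000], index=index, name="Fee")

# One total per course, indexed by Courses only.
course_totals = fee_sums.groupby(level="Courses").sum()

# level="Courses" matches each row of fee_sums with the total of its own course.
pct = fee_sums.div(course_totals, level="Courses") * 100
print(pct)
```

Without the `level` argument, pandas would try to match the two-level index against the one-level index directly, which is why the level name is passed explicitly.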

## 3. Using groupby with DataFrame.transform() Method

You can also calculate the percentage of the total within each group using `groupby()` along with the `transform()` method. `transform()` applies a function to each group and returns a result with the same index as the original DataFrame, so every row receives the total of its own group. Dividing each `Fee` by that total gives the percentage directly as a new column of `df`.

``````
# Using DataFrame.transform() method.
df['%'] = 100 * df['Fee'] / df.groupby('Courses')['Fee'].transform('sum')
print(df)
``````

Yields below output.

``````
# Output:
   Courses    Fee Duration           %
0    Spark  22000   30days   48.888889
1  PySpark  25000   50days   49.019608
2    Spark  23000   30days   51.111111
3   Python  24000     None  100.000000
4  PySpark  26000      NaN   50.980392
``````
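To see why `transform('sum')` can be divided into the original column, it helps to compare it with a plain `sum()`. This is a minimal sketch on the same sample data (without the `Duration` column); the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark", "Python", "PySpark"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
})

grouped = df.groupby("Courses")["Fee"]

# sum() collapses each group to a single row, indexed by course name.
totals_per_course = grouped.sum()
print(totals_per_course)

# transform("sum") repeats each group's total on every original row,
# so the result lines up with df and can be divided element-wise.
totals_per_row = grouped.transform("sum")
print(totals_per_row.tolist())  # [45000, 51000, 45000, 24000, 51000]
```

Because `totals_per_row` shares the index of `df`, the expression `100 * df['Fee'] / totals_per_row` needs no further alignment.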

Alternatively, you can use `DataFrame.transform()` with a `lambda` function to add each row's share of its group total as a new column, leaving the rest of the DataFrame untouched. Note that this version returns a fraction between 0 and 1 rather than a percentage.

``````
# Alternative method of DataFrame.transform() by lambda functions.
df['Courses_Fee'] = df.groupby(['Courses'])['Fee'].transform(lambda x: x/x.sum())
print(df)
``````

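Yields a new `Courses_Fee` column holding each fee as a fraction of its course total (for example, 0.488889 instead of 48.888889). Multiply it by 100 to get the percentage.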
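Because the lambda divides by the group sum without multiplying by 100, its result is a fraction. The sketch below shows one way to convert it to a rounded percentage; the `Courses_Fee_pct` column name is hypothetical and used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Spark", "Python", "PySpark"],
    "Fee": [22000, 25000, 23000, 24000, 26000],
})

# Fraction of each course total, as computed by the lambda.
df["Courses_Fee"] = df.groupby("Courses")["Fee"].transform(lambda x: x / x.sum())

# Hypothetical extra column: the same share as a percentage, rounded to two decimals.
df["Courses_Fee_pct"] = (100 * df["Courses_Fee"]).round(2)
print(df)
```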

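## 4. Another Example

You can also calculate each row's share of its group total by chaining `groupby()`, `rename()`, and `transform()` with a `lambda` function. As in the previous example, the result is a fraction rather than a percentage.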

``````
# Calculate groupby with DataFrame.rename() and DataFrame.transform() with lambda functions.
df2=df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("Courses_fee").groupby(level = 0).transform(lambda x: x/x.sum())
print(df2)
``````

Yields below output.

``````
# Output:
Courses  Fee
PySpark  25000    0.490196
         26000    0.509804
Python   24000    1.000000
Spark    22000    0.488889
         23000    0.511111
Name: Courses_fee, dtype: float64
``````
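The article does not say why the Series is renamed, so the following is an assumption: one practical benefit of `rename()` is that the result can later be flattened with `reset_index()` without clashing with the `Fee` index level. The sketch rebuilds the Series by hand from the output above; `fractions` and `flat` are hypothetical names.

```python
import pandas as pd

# Rebuild the result shown above: fractions indexed by (Courses, Fee).
index = pd.MultiIndex.from_tuples(
    [("PySpark", 25000), ("PySpark", 26000), ("Python", 24000),
     ("Spark", 22000), ("Spark", 23000)],
    names=["Courses", "Fee"],
)
fractions = pd.Series(
    [0.490196, 0.509804, 1.0, 0.488889, 0.511111],
    index=index,
    name="Courses_fee",
)

# With a distinct name, both index levels become ordinary columns.
flat = fractions.reset_index()
print(flat.columns.tolist())  # ['Courses', 'Fee', 'Courses_fee']

# If the Series were still named "Fee", the same call would fail,
# because a "Fee" column would be inserted twice.
clash_raised = False
try:
    fractions.rename("Fee").reset_index()
except ValueError as err:
    clash_raised = True
    print("Name clash:", err)
```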

## 5. Complete Examples to Calculate Percentage with Groupby

Below are the complete examples of calculating percentages with groupby on a pandas DataFrame.

``````
# Below are complete examples.

# Create a Pandas DataFrame.
import pandas as pd
import numpy as np
technologies= {
'Courses':["Spark","PySpark","Spark","Python","PySpark"],
'Fee' :[22000,25000,23000,24000,26000],
'Duration':['30days','50days','30days', None,np.nan]
}
df = pd.DataFrame(technologies)
print(df)

# Using DataFrame.agg() Method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
print(df2)

# Percentage by lambda and DataFrame.apply() method.
# group_keys=False keeps the original (Courses, Fee) index.
df3 = df2.groupby(level=0, group_keys=False).apply(lambda x: 100 * x / x.sum())
print(df3)

# Using DataFrame.div() method.
df2 = df.groupby(['Courses', 'Fee']).agg({'Fee': 'sum'})
Courses = df.groupby(['Courses']).agg({'Fee': 'sum'})
df3 = df2.div(Courses, level='Courses') * 100
print(df3)

# Using DataFrame.transform() method.
df['%'] = 100 * df['Fee'] / df.groupby('Courses')['Fee'].transform('sum')
print(df)

# Alternative method of DataFrame.transform() by lambda functions.
df['Courses_Fee'] = df.groupby(['Courses'])['Fee'].transform(lambda x: x/x.sum())
print(df)

# Calculate groupby with DataFrame.rename() and DataFrame.transform() with lambda functions.
df2=df.groupby(['Courses', 'Fee'])['Fee'].sum().rename("Courses_fee").groupby(level = 0).transform(lambda x: x/x.sum())
print(df2)
``````

## Conclusion

In this article, you have learned how to calculate the percentage of the total within each Pandas group by using `DataFrame.groupby()` along with the `DataFrame.agg()`, `DataFrame.transform()`, and `DataFrame.apply()` methods with a `lambda` function, as well as with `DataFrame.div()`.