To get the total or sum of a column, use the sum() method. To add the result as a row to the DataFrame, use loc[], at[], append(), or pandas.Series(). In this article, I will explain how to get the total/sum for a given column with examples.
Key Points –
- The sum() function aggregates data by calculating the total of the values in the specified columns of a DataFrame.
- By default, the sum() method operates on columns (axis=0), returning the sum for each column.
- By default, sum() skips NaN (Not a Number) values. You can use the min_count parameter to control how many non-null values are required to perform the sum.
- The sum() function works with numeric data types, allowing for the summation of integers and floats.
- The sum() method can be chained with other DataFrame methods (like filter(), apply(), etc.) for more complex data manipulation and analysis.
- You can assign the result of the sum calculation back to a new or existing column in the DataFrame.
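The defaults listed above can be checked on a small frame. The following is a minimal sketch; the DataFrame demo and its columns a and b are made up for illustration and are not part of this article's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical data for illustration only
demo = pd.DataFrame({
    "a": [1.0, 2.0, np.nan],
    "b": [10, 20, 30],
})

# Default: axis=0, one total per column, NaN values skipped
col_totals = demo.sum()

# axis=1 sums across each row instead
row_totals = demo.sum(axis=1)

# skipna=False lets a NaN propagate into the result
strict_total = demo["a"].sum(skipna=False)

# min_count=3 requires at least three non-null values; "a" has only two,
# so the result is NaN
guarded_total = demo["a"].sum(min_count=3)

print(col_totals)
print(row_totals)
print(strict_total, guarded_total)
```

Here col_totals and row_totals are Series, while the two single-column calls return scalars.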
Quick Examples of Get Total of Columns
If you are in a hurry, below are some quick examples of how to get the total of a given column of a pandas DataFrame.
# Quick examples of getting the total of a column
# Example 1: Use Series.sum() method on a single column
df2 = df['math'].sum()
# Example 2: Use Python's built-in sum() function
df2 = sum(df['math'])
# Example 3: Use DataFrame.loc[] and pandas.Series()
# To add a row holding the total of a column
df.loc['Total'] = pd.Series(df['math'].sum(), index = ['math'])
# Example 4: Use DataFrame.loc[] with a scalar
# (writes the same value into every column of the new row)
df.loc['Total'] = df["math"].sum()
# Example 5: Use DataFrame.loc[] with row and column labels
df.loc["Total", "math"] = df.math.sum()
# Example 6: Use DataFrame.at[] method
# To add the total of a column
df.at['Total', "math"] = df["math"].sum()
# Example 7: Use DataFrame.append() method (pandas < 2.0 only)
df2 = df.append(pd.DataFrame(df.math.sum(), index = ["Total"], columns=[ "math"]))
Now, let's create a DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns studentname, math, science, and english.
# Create Pandas DataFrame
import pandas as pd
studentdetails = {
"studentname":["Ram","Sam","Scott","Ann","John"],
"math" :[80,90,85,70,95],
"science" :[85,95,80,90,75],
"english" :[90,85,80,70,95]
}
index_labels=['r1','r2','r3','r4','r5']
df = pd.DataFrame(studentdetails ,index=index_labels)
print("Create DataFrame:\n", df)
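Yields below output.
# Output:
Create DataFrame:
     studentname  math  science  english
r1          Ram    80       85       90
r2          Sam    90       95       85
r3        Scott    85       80       80
r4          Ann    70       90       70
r5         John    95       75       95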
Use DataFrame.sum() Method
You can use the sum() method to calculate the sum/total of a column. Because df['math'] selects a single column as a Series, the call below is Series.sum(); it returns the total of the math column. Alternatively, you can pass the Series to Python's built-in sum() function.
# Use Series.sum() method on the math column
math_sum = df['math'].sum()
print("Get the sum of mathematics column:\n", math_sum)
# Use Python's built-in sum() function
math_sum = sum(df['math'])
print("Get the sum of mathematics column:\n", math_sum)
Both calls print 420 for this data.
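One practical difference between the two calls is how they treat missing values. Here is a small sketch, using a made-up Series named scores rather than this article's DataFrame:

```python
import math

import numpy as np
import pandas as pd

# Hypothetical column with one missing value
scores = pd.Series([80, 90, np.nan, 70])

# Series.sum() skips NaN by default
pandas_total = scores.sum()

# Python's built-in sum() adds element by element, so the NaN propagates
builtin_total = sum(scores)

print(pandas_total)
print(math.isnan(builtin_total))
```

If a column may contain NaN, the pandas method is usually the safer choice of the two.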
Use pandas.Series() to Get Total of Columns
Use pandas.Series() to create a sum row at the end of the DataFrame. Set the index of the Series to the name of the column you are summing so that the value aligns with that column; the remaining columns of the new row are filled with NaN.
# Use pandas.Series() to create a new row with the sum
df.loc['Total'] = pd.Series(df['math'].sum(), index = ['math'])
print(df)
Yields below output.
# Output:
       studentname   math  science  english
r1             Ram   80.0     85.0     90.0
r2             Sam   90.0     95.0     85.0
r3           Scott   85.0     80.0     80.0
r4             Ann   70.0     90.0     70.0
r5            John   95.0     75.0     95.0
Total          NaN  420.0      NaN      NaN
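The same alignment works when several columns are summed at once, because calling sum() on a column subset returns a Series indexed by column name. Below is a sketch under the assumption that only the score columns should be totalled; the shortened three-row dataset and the name score_cols are illustrative.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration
df = pd.DataFrame(
    {
        "studentname": ["Ram", "Sam", "Scott"],
        "math": [80, 90, 85],
        "science": [85, 95, 80],
        "english": [90, 85, 80],
    },
    index=["r1", "r2", "r3"],
)

score_cols = ["math", "science", "english"]

# Summing a column subset returns a Series indexed by column name
totals = df[score_cols].sum()

# Assigning that Series to a new row label aligns by column name;
# studentname has no matching entry and becomes NaN
df.loc["Total"] = totals
print(df)
```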
Get the Total of Columns Using the Series.sum() Method
Series.sum() gets you the sum of a column. For data without missing values, this gives the same result as numpy.sum. You can assign the sum of a column to a new row label in the DataFrame to create a row. Note that assigning a scalar this way writes the same value into every column of the new row, including studentname. The next example solves this issue.
# Get total of columns using sum method
df.loc['Total'] = df["math"].sum()
print(df)
Yields below output.
# Output:
       studentname  math  science  english
r1             Ram    80       85       90
r2             Sam    90       95       85
r3           Scott    85       80       80
r4             Ann    70       90       70
r5            John    95       75       95
Total          420   420      420      420
Use DataFrame.loc[] & DataFrame.sum() Methods
You can use DataFrame.loc[] with both a row label and a column label, together with the sum() method, to fix the above issue. Only the column you are summing gets the total in the new row; the other columns have NaN.
# Use DataFrame.loc[] with row and column labels & the sum() method
df.loc["Total", "math"] = df.math.sum()
print(df)
Yields below output.
# Output:
       studentname   math  science  english
r1             Ram   80.0     85.0     90.0
r2             Sam   90.0     95.0     85.0
r3           Scott   85.0     80.0     80.0
r4             Ann   70.0     90.0     70.0
r5            John   95.0     75.0     95.0
Total          NaN  420.0      NaN      NaN
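One caveat when running these snippets back to back: once a Total row exists, a later df['math'].sum() counts it as data. Here is a minimal sketch of the problem and one possible way around it, using a shortened copy of the data. Dropping the row with drop(..., errors="ignore") before summing is an assumption on my part about how you might handle this, not the only option.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration
df = pd.DataFrame(
    {"studentname": ["Ram", "Sam", "Scott"], "math": [80, 90, 85]},
    index=["r1", "r2", "r3"],
)

# First pass: add the total row
df.loc["Total", "math"] = df["math"].sum()

# A naive second pass counts the existing Total row as data
double_counted = df["math"].sum()

# Excluding the Total row first gives the intended result;
# errors="ignore" keeps this safe when the row does not exist yet
clean_total = df.drop(index="Total", errors="ignore")["math"].sum()
df.loc["Total", "math"] = clean_total

print(double_counted)
print(clean_total)
```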
Use DataFrame.at[] Method to Get Total of Columns
Alternatively, you can also use DataFrame.at[]. This gives the same result as above.
# Use DataFrame.at[] method to get total of columns
df.at['Total', "math"] = df["math"].sum()
print(df)
Yields the same output as above.
Use DataFrame.append() Method
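You can also use the DataFrame.append() method to add the total of a column as a new row. Unlike the previous examples, append() returns a new DataFrame and leaves df unchanged. Note that DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0, so this example runs only on older versions.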
# Use DataFrame.append() method (pandas < 2.0 only)
df2 = df.append(pd.DataFrame(df.math.sum(), index = ["Total"], columns=[ "math"]))
print(df2)
Yields below output.
# Output:
       studentname  math  science  english
r1             Ram    80     85.0     90.0
r2             Sam    90     95.0     85.0
r3           Scott    85     80.0     80.0
r4             Ann    70     90.0     70.0
r5            John    95     75.0     95.0
Total          NaN   420      NaN      NaN
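For pandas 2.0 and later, pandas.concat() gives an equivalent result. Here is a minimal sketch on a shortened copy of the data; the name total_row is illustrative.

```python
import pandas as pd

# Shortened copy of the article's data, for illustration
df = pd.DataFrame(
    {
        "studentname": ["Ram", "Sam", "Scott"],
        "math": [80, 90, 85],
        "science": [85, 95, 80],
    },
    index=["r1", "r2", "r3"],
)

# One-row DataFrame holding only the math total
total_row = pd.DataFrame({"math": [df["math"].sum()]}, index=["Total"])

# concat returns a new DataFrame and leaves df unchanged;
# columns missing from total_row become NaN in the new row
df2 = pd.concat([df, total_row])
print(df2)
```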
Complete Example For Getting Sum of Columns
import pandas as pd

studentdetails = {
    "studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
    "math": [80, 90, 85, 70, 95],
    "science": [85, 95, 80, 90, 75],
    "english": [90, 85, 80, 70, 95]
}
index_labels = ['r1', 'r2', 'r3', 'r4', 'r5']
df = pd.DataFrame(studentdetails, index=index_labels)

# Use Series.sum() method on a single column
math_sum = df['math'].sum()
print(math_sum)

# Use Python's built-in sum() function
math_sum = sum(df['math'])
print(math_sum)

# Each example below works on a copy of df, so a 'Total' row added by
# one example is not counted in the next sum

# Use DataFrame.loc[] and pandas.Series() to get total of columns
df1 = df.copy()
df1.loc['Total'] = pd.Series(df['math'].sum(), index=['math'])
print(df1)

# Use DataFrame.loc[] with a scalar (same value in every column)
df1 = df.copy()
df1.loc['Total'] = df["math"].sum()
print(df1)

# Use DataFrame.loc[] with row and column labels
df1 = df.copy()
df1.loc["Total", "math"] = df.math.sum()
print(df1)

# Use DataFrame.at[] method to get total of columns
df1 = df.copy()
df1.at['Total', "math"] = df["math"].sum()
print(df1)

# Use DataFrame.append() method (exists only in pandas < 2.0)
if hasattr(df, "append"):
    df2 = df.append(pd.DataFrame(df.math.sum(), index=["Total"], columns=["math"]))
    print(df2)
Conclusion
In this article, you have learned how to get the total of columns by using DataFrame.sum(), DataFrame.loc[], DataFrame.at[], DataFrame.append(), and pandas.Series() for all or given columns with examples.
Happy Learning !!