To sum all rows of a Pandas DataFrame, or only selected rows, use the sum() function. The Pandas DataFrame.sum() function returns the sum of the values along the requested axis. The default, axis=0, adds up the rows of each column and returns one total per column; axis=1 adds up the columns of each row and returns one total per row. In this article, I will explain how to sum a pandas DataFrame for all or given rows, with examples.
Key Points –
- The .sum() function is used to sum elements along a specific axis (rows or columns).
- By default, .sum(axis=0) calculates the sum for each column.
- To sum across the columns of each row, use .sum(axis=1).
- The method can be applied to both numeric and non-numeric data. String columns are concatenated rather than ignored; pass numeric_only=True to restrict the sum to numeric columns.
- Aggregating functions like .sum() can be combined with other DataFrame methods for filtering and grouping.
- The function can be applied to a subset of rows or columns by selecting the desired part of the DataFrame.
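The difference between the two axes is easiest to see on a tiny frame. The following is a minimal sketch; the two-row DataFrame and the variable names are made up for illustration only.

```python
import pandas as pd

# Hypothetical two-row, two-column frame used only for illustration
scores = pd.DataFrame(
    {"Mathematics": [80, 90], "Science": [85, 95]},
    index=["r1", "r2"],
)

# axis=0 (the default): add the rows of each column, one total per column
column_totals = scores.sum()

# axis=1: add the columns of each row, one total per row
row_totals = scores.sum(axis=1)

print(column_totals)
print(row_totals)
```

Here column_totals is indexed by column name and row_totals is indexed by row label.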
Quick Examples of Sum DataFrame Rows
If you are in a hurry, below are some quick examples of how to sum pandas DataFrame by given or all rows.
# Quick examples of sum DataFrame rows
# Example 1: Use sum() to sum the rows of each column
df1 = df.sum()
# Example 2: Add the sum of all rows as a new row in the DataFrame
# (DataFrame.append() was removed in pandas 2.0, so assign with loc[])
total = df.sum()
df.loc['Sum'] = total
# Example 3: Get sum of first 2 rows of DataFrame
total = df.iloc[0:2].sum()
# Example 4: Get sum of 3 rows (selected by index labels)
total = df.loc[['r1', 'r3', 'r4']].sum()
Now, let’s create a DataFrame with a few rows and columns, execute these examples, and validate the results. Our DataFrame contains the columns Studentname, Mathematics, Science, and English.
# Create DataFrame
import pandas as pd
studentdetails = {
    "Studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
    "Mathematics": [80, 90, 85, 70, 95],
    "Science": [85, 95, 80, 90, 75],
    "English": [90, 85, 80, 70, 95]
}
index_labels = ['r1', 'r2', 'r3', 'r4', 'r5']
df = pd.DataFrame(studentdetails, index=index_labels)
print("Create DataFrame:\n", df)
This prints the DataFrame with five rows, labeled r1 through r5, and the four columns listed above.
Using DataFrame.sum() to Sum All Rows
Use DataFrame.sum() to get the sum/total of a Pandas DataFrame along either axis. By default, this function takes axis=0, adds all the rows of each column, and returns a Pandas Series whose values are the per-column totals. If we pass axis=1 to this function, it instead adds the columns of each row and returns one total per row; when the DataFrame has a string column such as Studentname, also pass numeric_only=True so that only the numeric columns are added.
# Using sum() to Sum the rows of each column
df1 = df.sum()
print("Get sum of all rows in a DataFrame:\n", df1)
This returns a Series with one total per column. Note that for string columns, sum() concatenates the values instead of adding them. In our example, Studentname is a string column, so its total is RamSamScottAnnJohn.
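For a per-row total on this DataFrame, the string column has to be left out. The sketch below assumes a recent pandas version and uses numeric_only=True for that; the variable name row_totals is illustrative.

```python
import pandas as pd

# Same sample data as in this article
df = pd.DataFrame(
    {
        "Studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
        "Mathematics": [80, 90, 85, 70, 95],
        "Science": [85, 95, 80, 90, 75],
        "English": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Add the numeric columns of each row; Studentname is skipped
row_totals = df.sum(axis=1, numeric_only=True)
print(row_totals)
```

The result is a Series indexed by r1 through r5, holding each student's total across the three subjects.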
Add the Sum of Rows as a New Row of Pandas DataFrame
If you notice the above output, the actual row values that are part of the sum are not returned by the DataFrame.sum() function. However, you can get all rows including the sum row by assigning the result to a new DataFrame row. Let’s add a row 'Sum' that holds the sum of rows for each column. We can add this row by assigning the Series to a new index label with DataFrame.loc[]. (The DataFrame.append() method that was once used for this was removed in pandas 2.0.)
# Get sum of all rows as a Series (one total per column)
total = df.sum()
# Assign the totals as a new row labeled 'Sum'
df.loc['Sum'] = total
print("Add sum row to DataFrame:\n", df)
Yields below output. Here, the label passed to loc[] becomes the index of the new row.
# Output:
# Add sum row to DataFrame:
            Studentname  Mathematics  Science  English
r1                  Ram           80       85       90
r2                  Sam           90       95       85
r3                Scott           85       80       80
r4                  Ann           70       90       70
r5                 John           95       75       95
Sum  RamSamScottAnnJohn          420      425      420
As we can see from the above, the sum row has been added to the Pandas DataFrame with the index label Sum.
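Because DataFrame.append() is no longer available in pandas 2.0 and later, another way to attach a total row is pd.concat(). The following is a sketch under the assumption that only the numeric totals are wanted; the names totals and df_with_total are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
        "Mathematics": [80, 90, 85, 70, 95],
        "Science": [85, 95, 80, 90, 75],
        "English": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Per-column totals of the numeric columns only
totals = df.sum(numeric_only=True)
totals.name = "Sum"  # becomes the row label after transposing

# Turn the Series into a one-row DataFrame and stack it under df
df_with_total = pd.concat([df, totals.to_frame().T])
print(df_with_total)
```

With this approach the Studentname cell of the Sum row is left as a missing value instead of a concatenated string.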
Pandas Sum Specified Rows using iloc[]
We can also calculate the sum for specified rows of the DataFrame by passing a positional range to the DataFrame.iloc[] property. This property selects the specified portion of rows, and the sum() function adds them up and returns the totals in the form of a Series.
# Get sum of first 2 rows of DataFrame
# (named total to avoid shadowing Python's built-in sum)
total = df.iloc[0:2].sum()
print("Get sum of specified rows:\n", total)
Yields below output.
# Output:
# Get sum of specified rows:
Studentname    RamSam
Mathematics       170
Science           180
English           175
dtype: object
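iloc[] also accepts a list of positions, which helps when the rows to add are not next to each other. This is a small sketch on the same sample data; the chosen positions and the name picked_total are illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
        "Mathematics": [80, 90, 85, 70, 95],
        "Science": [85, 95, 80, 90, 75],
        "English": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Select the 1st, 3rd, and 5th rows by position, then add the numeric columns
picked_total = df.iloc[[0, 2, 4]].sum(numeric_only=True)
print(picked_total)
```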
Pandas Sum Specified Rows using loc[]
Use the DataFrame.loc[] attribute to select rows by their index labels, and then call the sum() function on the selection. This returns the sum of the specified rows in the form of a Series.
# Get sum of 3 DataFrame rows (selected by index labels)
# (named total to avoid shadowing Python's built-in sum)
total = df.loc[['r1', 'r3', 'r4']].sum()
print("Get sum of specified rows:\n", total)
Yields below output.
# Output:
# Get sum of specified rows:
Studentname    RamScottAnn
Mathematics            235
Science                255
English                240
dtype: object
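loc[] can also take a label slice. Unlike positional slicing with iloc[], a label slice includes its end label. The sketch below illustrates this on the same sample data; the name range_total is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
        "Mathematics": [80, 90, 85, 70, 95],
        "Science": [85, 95, 80, 90, 75],
        "English": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Label slice 'r2':'r4' selects r2, r3, and r4 (end label included)
range_total = df.loc["r2":"r4"].sum(numeric_only=True)
print(range_total)
```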
FAQ on Pandas Sum DataFrame Rows
To sum all rows in a Pandas DataFrame, you can use the sum() function, which defaults to axis=0 and returns one total per column. If you want the sum of all values across the entire DataFrame, sum each column first and then sum the resulting Series.
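A grand total can be sketched by chaining two sum() calls. The small frame and the names column_totals and grand_total below are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical two-row frame with one string column
df = pd.DataFrame(
    {
        "Studentname": ["Ram", "Sam"],
        "Mathematics": [80, 90],
        "Science": [85, 95],
    },
    index=["r1", "r2"],
)

# First sum: one total per numeric column
column_totals = df.sum(numeric_only=True)

# Second sum: add the per-column totals into a single number
grand_total = column_totals.sum()
print(grand_total)
```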
You can sum rows for specific columns in a Pandas DataFrame by selecting the desired columns and using the sum()
function with the axis=1
parameter.
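As a brief illustration, the sketch below picks two columns and adds them per row. The frame and the name two_subject_totals are hypothetical.

```python
import pandas as pd

# Hypothetical frame with three numeric columns
df = pd.DataFrame(
    {
        "Mathematics": [80, 90],
        "Science": [85, 95],
        "English": [90, 85],
    },
    index=["r1", "r2"],
)

# Select only the columns of interest, then sum across them for each row
two_subject_totals = df[["Mathematics", "Science"]].sum(axis=1)
print(two_subject_totals)
```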
By default, sum()
skips NaN
values. If you want to include them and return NaN
when any value in the row is NaN
, use skipna=False
.
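The effect of skipna can be seen on a frame with one missing value. This is a minimal sketch; the data and variable names are made up for illustration.

```python
import pandas as pd

# Hypothetical frame where r2 has no Mathematics mark
marks = pd.DataFrame(
    {"Mathematics": [80, float("nan")], "Science": [85, 95]},
    index=["r1", "r2"],
)

# Default behavior: NaN is skipped, so r2 is just its Science mark
default_totals = marks.sum(axis=1)

# skipna=False: any NaN in the row makes that row's total NaN
strict_totals = marks.sum(axis=1, skipna=False)

print(default_totals)
print(strict_totals)
```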
You can sum rows conditionally in a Pandas DataFrame by using boolean indexing and the sum()
function.
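For example, a condition on one column can pick the rows to add. The threshold of 85 and the name high_math_total in this sketch are arbitrary choices for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Studentname": ["Ram", "Sam", "Scott", "Ann", "John"],
        "Mathematics": [80, 90, 85, 70, 95],
        "Science": [85, 95, 80, 90, 75],
        "English": [90, 85, 80, 70, 95],
    },
    index=["r1", "r2", "r3", "r4", "r5"],
)

# Boolean mask: True for rows where the Mathematics mark is at least 85
mask = df["Mathematics"] >= 85

# Keep only the matching rows (r2, r3, r5) and add their numeric columns
high_math_total = df[mask].sum(numeric_only=True)
print(high_math_total)
```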
To sum only the numeric columns in each row, you can use the select_dtypes() method to filter numeric columns and then call sum() with axis=1 on the result.
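A short sketch of that filter-then-sum pattern follows. The frame and the names numeric_part and row_totals are hypothetical.

```python
import pandas as pd

# Hypothetical frame mixing a string column with numeric columns
df = pd.DataFrame(
    {
        "Studentname": ["Ram", "Sam"],
        "Mathematics": [80, 90],
        "Science": [85, 95],
    },
    index=["r1", "r2"],
)

# Keep only the numeric columns
numeric_part = df.select_dtypes(include="number")

# Sum across the remaining columns for each row
row_totals = numeric_part.sum(axis=1)
print(row_totals)
```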
Conclusion
In this article, I have explained how to sum all Pandas DataFrame rows over the columns using the sum() function, and how to sum only selected rows using the iloc[] and loc[] attributes, with several examples.