How do you count rows with a condition in pandas? You can count the number of rows that satisfy single or multiple conditions in a pandas DataFrame using the shape attribute, len(), df.index, and apply() with lambda functions. In this article, I will explain how to count the number of rows with conditions in a DataFrame by using these approaches with examples.
Related: You can also get the count of columns in pandas.
Key Points –
- Conditional counting can be performed using boolean indexing to filter rows based on specific criteria.
- The .shape attribute can be used to get the number of rows after applying a condition.
- Use the len() function to count the number of rows returned by a filtered DataFrame.
- Use the count() method to count non-null values in a specific column based on a condition.
- Combine multiple conditions using logical operators like & (AND) and | (OR) to refine row counts.
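The key points above can be sketched in a few lines. This is only a minimal illustration: the `sample` DataFrame and its values are hypothetical and are not the article's data, and the variable names are chosen for this sketch.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
sample = pd.DataFrame({
    "Courses": ["Spark", "Pandas", "Pandas", "Hadoop"],
    "Fee": [22000, 35000, None, 30000],
})

# Boolean Series with one True/False value per row
mask = sample["Courses"] == "Pandas"

rows_by_shape = sample[mask].shape[0]   # .shape is an attribute: (rows, columns)
rows_by_len = len(sample[mask])         # len() of the filtered DataFrame
rows_by_sum = int(mask.sum())           # True counts as 1, so sum() counts matches
non_null_fee = int(sample.loc[mask, "Fee"].count())  # count() skips NaN values

print(rows_by_shape, rows_by_len, rows_by_sum, non_null_fee)
```

Note that count() can return a smaller number than len() when the selected column contains missing values, as it does for the hypothetical Fee column here.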
Quick Examples of Count Rows with Condition
If you are in a hurry, below are some quick examples of how to count rows with conditions in pandas.
# Quick examples of count rows with condition
# Example 1: Use len() function
# To count rows with a single condition
df2 = len(df[df["Courses"]=="Pandas"])
# Example 2: Use len() function
# To count rows with multiple conditions
df2 = len(df[(df["Courses"]=="Pandas") &
             (df["Fee"]==35000)])
# Example 3: Count rows with multiple conditions
df2 = len(df[(df["Courses"]=="Pandas") &
             (df["Fee"]==35000) &
             (df["Duration"]>= "35days")])
# Example 4: Use DataFrame.apply() & lambda function
df2 = df.apply(lambda x : True
               if x['Courses'] == "Spark" else False, axis = 1)
df3 = len(df2[df2 == True].index)
Now, let's create a pandas DataFrame using data from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.
import pandas as pd
import numpy as np
technologies= ({
'Courses':["Spark","PySpark","Hadoop","Pandas","Spark","PySpark", "Pandas"],
'Fee': [22000,25000,30000,35000,22000,25000,35000],
'Duration':['30days','50days','40days','35days','30days','50days','60days'],
'Discount':[1000,2000,2500,1500,1000,2000,1500]
})
index_labels=['r1','r2','r3','r4','r5','r6','r7']
df = pd.DataFrame(technologies,index=index_labels)
print("Create DataFrame:\n", df)
This creates a DataFrame with seven rows (r1 through r7) and four columns.
Pandas len() Function to Count Rows by Condition
To count the number of rows that satisfy a condition, first use df[] to filter the rows and then use len() to count the rows that remain. Here, the condition selects the "Courses" column and checks which of its values are equal to "Pandas". The len() function then returns the number of rows where the condition matched.
# Use len() function
# To count rows with single condition
df2 = len(df[df["Courses"]=="Pandas"])
print("Get the rows count with condition:\n", df2)
# Output:
# Get the rows count with condition:
# 2
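To make the filter-then-count idea explicit, the one-liner can be split into its separate steps. The sketch below rebuilds only the Courses column with the article's values and index labels; the names `mask`, `matched`, and `row_count` are introduced here for illustration.

```python
import pandas as pd

# Same Courses values and index labels as the article's DataFrame
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Pandas", "Spark", "PySpark", "Pandas"]},
    index=["r1", "r2", "r3", "r4", "r5", "r6", "r7"],
)

mask = df["Courses"] == "Pandas"   # Step 1: boolean Series, True where the row matches
matched = df[mask]                 # Step 2: keep only the rows where mask is True
row_count = len(matched)           # Step 3: count the rows that remain

print(list(matched.index))  # ['r4', 'r7']
print(row_count)            # 2
```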
Use len() Function to Count Rows with Multiple Conditions
Alternatively, you can use the len() function to count the rows after filtering by multiple conditions in a DataFrame. Here, I apply conditions on the "Courses" column and the "Fee" column and then get the count after the filter. The condition on "Courses" checks whether the values are equal to "Pandas", whereas the condition on "Fee" checks whether the values are equal to 35000. Each condition must be wrapped in parentheses before combining them with &.
# Use len() function to count rows with multiple conditions
df2 = len(df[(df["Courses"]=="Pandas") &
             (df["Fee"]==35000)])
print("Get the rows count with multiple conditions:\n", df2)
# Output:
# Get the rows count with multiple conditions:
# 2
# Count rows with three conditions
# Note: Duration holds strings, so >= compares them as text, not as numbers
df2 = len(df[(df["Courses"]=="Pandas") &
             (df["Fee"]==35000) &
             (df["Duration"]>= "35days")])
print("Get the rows count with multiple conditions:\n", df2)
# Output:
# Get the rows count with multiple conditions:
# 2
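One caveat about the Duration condition: because the column holds strings such as "35days", the >= operator compares them character by character. That happens to work for the article's values, but it can miscount once a duration has a different number of digits. The sketch below uses a hypothetical `demo` DataFrame with a made-up "100days" value to show the difference, and extracts the numeric part with Series.str.extract() before comparing.

```python
import pandas as pd

# Hypothetical data; "100days" is not in the article's DataFrame
demo = pd.DataFrame({
    "Courses": ["Pandas", "Pandas", "Pandas"],
    "Duration": ["35days", "60days", "100days"],
})

# Text comparison: "100days" sorts before "35days" because "1" < "3"
string_count = len(demo[demo["Duration"] >= "35days"])

# Pull out the leading digits and compare them as integers instead
days = demo["Duration"].str.extract(r"(\d+)", expand=False).astype(int)
numeric_count = len(demo[days >= 35])

print(string_count)   # 2
print(numeric_count)  # 3
```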
Use DataFrame.apply() & Lambda Function
Similarly, pass a lambda expression with the condition to the DataFrame.apply() function to flag the rows that match, and then apply len() to the flagged rows to get the count.
# Use DataFrame.apply() & lambda function
# Flag each row with True when Courses equals "Spark"
df2 = df.apply(lambda x : True
               if x['Courses'] == "Spark" else False, axis = 1)
# Count the flagged rows
df3 = len(df2[df2 == True].index)
print("Get the rows count with condition:\n", df3)
# Output:
# Get the rows count with condition:
# 2
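For a simple equality check like this one, a vectorized comparison should give the same count without calling a Python function on every row. The sketch below reuses the article's Courses values and compares both approaches; `apply_count` and `vector_count` are names introduced here for illustration.

```python
import pandas as pd

# Same Courses values and index labels as the article's DataFrame
df = pd.DataFrame(
    {"Courses": ["Spark", "PySpark", "Hadoop", "Pandas", "Spark", "PySpark", "Pandas"]},
    index=["r1", "r2", "r3", "r4", "r5", "r6", "r7"],
)

# Row-wise approach: the lambda returns True or False for each row
flags = df.apply(lambda row: row["Courses"] == "Spark", axis=1)
apply_count = int(flags.sum())

# Vectorized approach: compare the whole column at once
vector_count = int((df["Courses"] == "Spark").sum())

print(apply_count, vector_count)  # 2 2
```

apply() with a lambda remains useful when the per-row logic is too involved to express as column operations.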
Complete Example For Count Rows with Condition
import pandas as pd
import numpy as np
technologies= ({
'Courses':["Spark","PySpark","Hadoop","Pandas","Spark","PySpark", "Pandas"],
'Fee': [22000,25000,30000,35000,22000,25000,35000],
'Duration':['30days','50days','40days','35days','30days','50days','60days'],
'Discount':[1000,2000,2500,1500,1000,2000,1500]
})
index_labels=['r1','r2','r3','r4','r5','r6','r7']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Pandas count rows dataframe.index
df2 = len(df.index)
print(df2)
# Pandas count rows using len()
df2 = len(df)
print(df2)
# Use len() function to count rows with single condition
df2 = len(df[df["Courses"]=="Pandas"])
print(df2)
# Use len() function to count rows with multiple condition
df2 = len(df[(df["Courses"]=="Pandas") &
             (df["Fee"]==35000)])
print(df2)
# Count rows with multiple condition
df2 = len(df[(df["Courses"]=="Pandas") &
             (df["Fee"]==35000) &
             (df["Duration"]>= "35days")])
print(df2)
# Use Dataframe.apply() & Lambda Function
df2 = df.apply(lambda x : True
               if x['Courses'] == "Spark" else False, axis = 1)
df3 = len(df2[df2 == True].index)
print(df3)
Frequently Asked Questions on Pandas Count Rows with Condition
You can use the logical AND (&
) or logical OR (|
) operators to specify the multiple conditions in order to count the number of rows. For example, row_count = ((df['col1'] > col1_value) & (df['col2'] == 'col2_value')).sum()
You can count missing (NaN) values in a specific column using the isna()
function. For example, row_count = df['column'].isna().sum()
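The two answers above can be tried out on a small DataFrame. This is a minimal sketch under stated assumptions: `faq_df`, its values, and the thresholds are hypothetical and chosen only to make the counts easy to verify by eye.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one missing Fee value
faq_df = pd.DataFrame({
    "Courses": ["Spark", "Pandas", "Hadoop", "Pandas"],
    "Fee": [22000, 35000, np.nan, 30000],
})

# AND: both conditions must hold; each one is wrapped in parentheses
and_count = int(((faq_df["Fee"] > 25000) & (faq_df["Courses"] == "Pandas")).sum())

# OR: either condition may hold
or_count = int(((faq_df["Fee"] > 25000) | (faq_df["Courses"] == "Spark")).sum())

# Missing values: isna() flags NaN entries, sum() counts them
nan_count = int(faq_df["Fee"].isna().sum())

print(and_count, or_count, nan_count)  # 2 3 1
```

A comparison against NaN evaluates to False, so the row with the missing Fee is not counted by either the AND or the OR condition here.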
Conclusion
In this article, I have explained how to count the number of rows that meet single or multiple conditions in a pandas DataFrame using the DataFrame.shape attribute, len(), DataFrame.index, and DataFrame.apply() with a lambda function, with examples.
Happy Learning !!