How do you count rows that satisfy a condition in pandas? You can count the rows of a DataFrame that match a single condition or multiple conditions by using len(), df.index, the shape attribute, or apply() with a lambda function. In this article, I will explain how to count the number of rows with conditions in a DataFrame by using these functions with examples.
1. Quick Examples of Count Rows with Condition
If you are in a hurry, below are some quick examples of how to count rows with conditions in pandas.
# Below are some quick examples
# Example 1: Use len() function
# To count rows with a single condition
df2 = len(df[df["Courses"]=="Pandas"])
# Example 2: Use len() function
# To count rows with multiple conditions
df2 = len(df[(df["Courses"]=="Pandas") &
(df["Fee"]==35000)])
# Example 3: Count rows with multiple conditions
df2 = len(df[(df["Courses"]=="Pandas") &
(df["Fee"]==35000) &
(df["Duration"]>= "35days")])
# Example 4: Use DataFrame.apply() & lambda function
# The comparison already returns True/False for each row
df2 = df.apply(lambda x: x['Courses'] == "Spark", axis=1)
df3 = len(df2[df2].index)
Now, let's create a pandas DataFrame using data from a Python dictionary, where the columns are Courses, Fee, Duration, and Discount.
import pandas as pd
import numpy as np
technologies= ({
'Courses':["Spark","PySpark","Hadoop","Pandas","Spark","PySpark", "Pandas"],
'Fee': [22000,25000,30000,35000,22000,25000,35000],
'Duration':['30days','50days','40days','35days','30days','50days','60days'],
'Discount':[1000,2000,2500,1500,1000,2000,1500]
})
index_labels=['r1','r2','r3','r4','r5','r6','r7']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
This yields the output below.
# Output:
    Courses    Fee Duration  Discount
r1    Spark  22000   30days      1000
r2  PySpark  25000   50days      2000
r3   Hadoop  30000   40days      2500
r4   Pandas  35000   35days      1500
r5    Spark  22000   30days      1000
r6  PySpark  25000   50days      2000
r7   Pandas  35000   60days      1500
2. Pandas len() Function to Count Rows by Condition
To count the rows that match a condition, first use df[] to filter the rows and then use len() to count the rows that remain after the filter. Here, the expression df["Courses"]=="Pandas" checks each value of the "Courses" column against "Pandas", and the len() function counts the rows where the condition is true.
# Use len() function to count rows with single condition
df2 = len(df[df["Courses"]=="Pandas"])
print(df2)
# Output:
# 2
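As an alternative to filtering first, the comparison itself produces a boolean Series, and because True counts as 1, summing it should give the same count. The sketch below assumes the same Courses and Fee data as above; the names mask, count_sum, and count_index are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Pandas", "Spark", "PySpark", "Pandas"],
    "Fee": [22000, 25000, 30000, 35000, 22000, 25000, 35000],
})

# Boolean Series: True where the condition holds
mask = df["Courses"] == "Pandas"

# True counts as 1, so the sum is the number of matching rows
count_sum = int(mask.sum())

# Same count taken from the index of the filtered DataFrame
count_index = len(df[mask].index)

print(count_sum, count_index)
# Output:
# 2 2
```

Summing the mask avoids building the filtered DataFrame, which can matter when you only need the number and not the rows themselves.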
5. Use len() Function to Count Rows with Multiple Conditions
Similarly, you can use the len() function to count the rows after filtering by multiple conditions in a DataFrame. Here, I apply conditions on the "Courses" column and the "Fee" column and then get the count after the filter. The condition on "Courses" checks that the values are equal to "Pandas", whereas the condition on "Fee" checks that the values are equal to 35000. Each condition must be wrapped in parentheses and combined with the & operator.
# Use len() function to count rows with multiple conditions
df2 = len(df[(df["Courses"]=="Pandas") &
(df["Fee"]==35000)])
print(df2)
# Output:
# 2
# Count rows with three conditions
# Note: Duration holds strings, so >= compares them lexicographically
df2 = len(df[(df["Courses"]=="Pandas") &
(df["Fee"]==35000) &
(df["Duration"]>= "35days")])
print(df2)
# Output:
# 2
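One caveat with the last condition: because "Duration" holds strings, the >= comparison is lexicographic, which happens to work for the data above but can miscount once the numbers have different lengths. The sketch below uses a small hypothetical DataFrame (the "100days" row is not in the article's data) and assumes every value has the form "<number>days".

```python
import pandas as pd

# Hypothetical data, including a three-digit duration
df = pd.DataFrame({
    "Courses": ["Spark", "Pandas", "Pandas", "Pandas"],
    "Fee": [22000, 35000, 35000, 35000],
    "Duration": ["30days", "35days", "60days", "100days"],
})

# String comparison: "100days" sorts before "35days", so that row is missed
string_count = len(df[(df["Courses"] == "Pandas") & (df["Duration"] >= "35days")])

# Assumption: every Duration value looks like "<number>days"
days = df["Duration"].str.replace("days", "", regex=False).astype(int)
numeric_count = len(df[(df["Courses"] == "Pandas") & (days >= 35)])

print(string_count)
print(numeric_count)
# Output:
# 2
# 3
```

Converting the column to integers before comparing is one way to make the condition mean "at least 35 days" regardless of how many digits the value has.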
6. Use DataFrame.apply() & Lambda Function
Pass a lambda expression containing the condition to the DataFrame.apply() function with axis=1 to flag the rows that match, and then use len() on the flagged rows to get the count.
# Use DataFrame.apply() & lambda function
# The comparison already returns True/False for each row
df2 = df.apply(lambda x: x['Courses'] == "Spark", axis=1)
df3 = len(df2[df2].index)
print(df3)
# Output:
# 2
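The lambda receives each row as a Series, so it can also test several columns at once. The sketch below is an illustration under the same data as above (the Fee < 25000 threshold is an arbitrary choice for the example) and compares the row-wise result with the equivalent column-wise mask.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Pandas", "Spark", "PySpark", "Pandas"],
    "Fee": [22000, 25000, 30000, 35000, 22000, 25000, 35000],
})

# Row-wise: the lambda is called once per row and returns True or False
flags = df.apply(lambda row: row["Courses"] == "Spark" and row["Fee"] < 25000, axis=1)
count_apply = int(flags.sum())

# Column-wise equivalent of the same condition
count_mask = int(((df["Courses"] == "Spark") & (df["Fee"] < 25000)).sum())

print(count_apply, count_mask)
# Output:
# 2 2
```

Both approaches should agree; apply() with axis=1 runs Python code for every row, so the column-wise mask is usually the faster choice on large DataFrames, while apply() is handy when the condition is hard to express with column operations.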
7. Complete Example For Count Rows with Condition
import pandas as pd
import numpy as np
technologies= ({
'Courses':["Spark","PySpark","Hadoop","Pandas","Spark","PySpark", "Pandas"],
'Fee': [22000,25000,30000,35000,22000,25000,35000],
'Duration':['30days','50days','40days','35days','30days','50days','60days'],
'Discount':[1000,2000,2500,1500,1000,2000,1500]
})
index_labels=['r1','r2','r3','r4','r5','r6','r7']
df = pd.DataFrame(technologies,index=index_labels)
print(df)
# Pandas count rows dataframe.index
df2 = len(df.index)
print(df2)
# Pandas count rows using len()
df2 = len(df)
print(df2)
# Use len() function to count rows with single condition
df2 = len(df[df["Courses"]=="Pandas"])
print(df2)
# Use len() function to count rows with multiple conditions
df2 = len(df[(df["Courses"]=="Pandas") &
(df["Fee"]==35000)])
print(df2)
# Count rows with three conditions
# Note: Duration holds strings, so >= compares them lexicographically
df2 = len(df[(df["Courses"]=="Pandas") &
(df["Fee"]==35000) &
(df["Duration"]>= "35days")])
print(df2)
# Use DataFrame.apply() & lambda function
# The comparison already returns True/False for each row
df2 = df.apply(lambda x: x['Courses'] == "Spark", axis=1)
df3 = len(df2[df2].index)
print(df3)
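DataFrame.shape can be used the same way as len(). Note that shape is an attribute that holds a (rows, columns) tuple, so it is read without parentheses. A minimal sketch, assuming the same Courses and Fee data as above; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Courses": ["Spark", "PySpark", "Hadoop", "Pandas", "Spark", "PySpark", "Pandas"],
    "Fee": [22000, 25000, 30000, 35000, 22000, 25000, 35000],
})

# shape is a (rows, columns) tuple; index 0 is the row count
total_rows = df.shape[0]

# Row count after filtering by a condition
pandas_rows = df[df["Courses"] == "Pandas"].shape[0]

print(total_rows, pandas_rows)
# Output:
# 7 2
```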
8. Conclusion
In this article, I have explained how to count rows with single and multiple conditions in a pandas DataFrame using len(), DataFrame.index, the DataFrame.shape attribute, and DataFrame.apply() with a lambda function, with examples.
Happy Learning !!