Spark By {Examples}

Pandas Get Count of Each Row of DataFrame

In Pandas, you can get the count of each row of a DataFrame using the DataFrame.count() method. To get the per-row count, pass axis='columns' (or the equivalent axis=1) as an argument to count(). Note that count() excludes None and NaN values from the count.

Syntax of df.count()

Following is the syntax of df.count().


# Syntax of df.count()
df.count(axis='columns')
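
To make the effect of the axis argument concrete, here is a minimal sketch that contrasts the default per-column count with the per-row count. The toy DataFrame and its column names a and b are made up for illustration and are not part of the article's data.

```python
import pandas as pd
import numpy as np

# Hypothetical toy data, used only to contrast the two axes
toy = pd.DataFrame({"a": [1, np.nan, 3], "b": ["x", "y", None]})

# Default (axis=0 / axis='index'): one count per column
per_column = toy.count()

# axis='columns' (same as axis=1): one count per row
per_row = toy.count(axis="columns")

print(per_column)
print(per_row)
```

The first result is indexed by column name, the second by row label.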

Now let’s create a DataFrame, run this, and explore the output. Our DataFrame contains four columns: Courses, Courses Fee, Duration, and Discount.


import pandas as pd
import numpy as np
technologies = {
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Courses Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '30days', None, np.nan],
    'Discount': [1000, 2300, 1000, 1200, 2500]
}
df = pd.DataFrame(technologies)
print(df)

Yields below output.


# Output:
   Courses  Courses Fee Duration  Discount
0    Spark        22000   30days      1000
1  PySpark        25000   50days      2300
2   Hadoop        23000   30days      1000
3   Python        24000     None      1200
4   Pandas        26000      NaN      2500

Pandas Get Count of Each DataFrame Row

Now, let’s run DataFrame.count() to get the count of each row, ignoring None and NaN values. The count() method counts the non-null values along a specified axis. To count the non-null values in each row, use axis=1 or axis='columns'; the two spellings are equivalent and return the same result.


# Get count of each dataframe row 
df2 = df.count(axis='columns')
print(df2)

Yields below output. Note that rows 3 and 4 each have a count of 3 because the Duration column holds None or NaN in those rows.


# Output:
0    4
1    4
2    4
3    3
4    3
dtype: int64
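
If you need the complement, that is, how many values are missing in each row, one possible approach is to combine isna() with sum(axis=1). The sketch below rebuilds the article's DataFrame so it runs on its own; the variable names missing_per_row and non_null_per_row are chosen here for illustration.

```python
import pandas as pd
import numpy as np

# Same data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Courses Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '30days', None, np.nan],
    'Discount': [1000, 2300, 1000, 1200, 2500]
})

# Non-null values per row
non_null_per_row = df.count(axis='columns')

# Missing values per row: isna() marks None/NaN as True, sum adds them up
missing_per_row = df.isna().sum(axis=1)

print(missing_per_row)
```

For every row, the two counts add up to the number of columns (df.shape[1]).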

Alternatively, you can pass axis=1, which is the integer form of axis='columns'. This returns a Series containing the count of non-null values in each row of the DataFrame df.


# Get count of each DataFrame row
row_counts = df.count(axis=1)
print(row_counts)

In the above example, df.count(axis=1) is used to count the number of non-null values in each row of the DataFrame df, and the resulting counts are stored in the row_counts Series. Yields the same output as above.
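
As a quick check, the sketch below confirms that both axis spellings give identical results, and then attaches the counts to the DataFrame as a new column. The column name NonNullCount is a hypothetical name chosen for this example.

```python
import pandas as pd
import numpy as np

# Same data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Courses Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '30days', None, np.nan],
    'Discount': [1000, 2300, 1000, 1200, 2500]
})

by_int = df.count(axis=1)            # integer form
by_name = df.count(axis='columns')   # string form

# Both spellings produce the same Series
print(by_int.equals(by_name))

# Attach the per-row count as a new column (hypothetical column name);
# assign() returns a copy and leaves df unchanged
df_with_count = df.assign(NonNullCount=by_int)
print(df_with_count)
```

Using assign() keeps the original df intact, so a later df.count(axis=1) is not affected by the extra column.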

Frequently Asked Questions on Get Count of Each Row of DataFrame

What does count(axis=1) do in Pandas?

The count(axis=1) method in Pandas counts the non-null values in each row of a DataFrame and returns them as a Series indexed by the row labels.

How do I count non-null values in each row of a DataFrame?

You can use the count(axis=1) method in Pandas. It returns a Series containing the count of non-null values for each row.

How can I handle missing values while counting each row in a DataFrame?

Pandas handles missing values automatically when you count with count(axis=1): None, NaN, and NaT are excluded from the count. Values such as empty strings or 0 are not treated as missing, so they are counted.
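
The following sketch illustrates which values count() treats as missing. The sample DataFrame and its column names are invented for this illustration; subtracting blank strings is just one possible way to handle them, not a built-in option of count().

```python
import pandas as pd
import numpy as np

# Hypothetical sample mixing real values, blanks, and missing markers
sample = pd.DataFrame({
    "text": ["a", "", None],
    "number": [0.0, np.nan, 1.5],
    "when": [pd.Timestamp("2024-01-01"), pd.NaT, pd.NaT],
})

# None, NaN, and NaT are skipped; "" and 0.0 are counted
counts = sample.count(axis=1)
print(counts)

# To also treat empty strings as missing, subtract them explicitly
is_blank = sample["text"].eq("")
counts_without_blanks = counts - is_blank.astype(int)
print(counts_without_blanks)
```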

What is the performance impact of using count(axis=1) on large DataFrames?

The count(axis=1) method in Pandas uses vectorized operations, so it is generally much faster than looping over rows or calling apply() with a Python function. Its cost still grows with the number of cells, so expect it to take longer on very large DataFrames.

Can I customize the counting process for specific requirements?

While count(axis=1) provides a straightforward way to count non-null values in each row, you can customize the counting process by combining it with other Pandas methods, for example by selecting a subset of columns before counting or by summing a boolean condition across each row.
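
Two possible customizations are sketched below, using the article's DataFrame: counting non-null values over selected columns only, and counting how many values in each row satisfy a condition. The threshold of 2000 and the variable names are arbitrary choices for illustration.

```python
import pandas as pd
import numpy as np

# Same data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", "Python", "Pandas"],
    'Courses Fee': [22000, 25000, 23000, 24000, 26000],
    'Duration': ['30days', '50days', '30days', None, np.nan],
    'Discount': [1000, 2300, 1000, 1200, 2500]
})

# 1) Count non-null values in only some of the columns
subset_counts = df[['Duration', 'Discount']].count(axis=1)
print(subset_counts)

# 2) Count values per row that meet a condition (here: greater than 2000);
#    the comparison yields booleans, and sum(axis=1) counts the True values
above_threshold = (df[['Courses Fee', 'Discount']] > 2000).sum(axis=1)
print(above_threshold)
```

Note that the second variant uses sum() on a boolean DataFrame rather than count(), because count() would count every non-null boolean regardless of whether it is True or False.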
