Post last modified: November 20, 2024
Pandas DataFrame count() Function

The Pandas DataFrame.count() function counts the non-NA cells in each column or row along a specified axis. It works with non-floating-point data types as well. To count the non-NA cells in each row, pass axis=1 or 'columns'; to count them in each column, pass axis=0 or 'index', which is the default.


In this article, I will explain the pandas DataFrame.count() function, its syntax and parameters, and how to use it to return the number of non-NA cells for each column or row along a specified axis.

Key Points –

  • It returns the number of non-null (non-NaN) values in each column or row of a DataFrame.
  • By default, it counts non-null values along columns (axis=0).
  • You can count non-null values across rows by setting axis=1.
  • It automatically excludes NaN or None values from the count.
  • The function works with both numeric and non-numeric data types, but can be restricted to numeric types using numeric_only=True.

Quick Examples of Pandas DataFrame count() Function

If you are in a hurry, below are some quick examples of how to use the DataFrame count() function.


# Quick examples of Pandas DataFrame count()
# (assumes df is an existing DataFrame, such as the one created below)

# Example 1: Count non-NA values in each column (default)
df2 = df.count()

# Example 2: Count non-NA values in each column,
# with the axis stated explicitly
df2 = df.count(axis=0)

# Example 3: Count non-NA values in each row
df2 = df.count(axis='columns')

# Example 4: Same as Example 3, using axis=1
df2 = df.count(axis=1)

count() Function in Pandas

The count() function in Pandas is used to count the number of non-missing or non-NA/null entries in each column or row of a DataFrame or Series. It excludes NaN (Not a Number) values by default. This function is particularly useful when you want to quickly get a sense of how much valid (non-null) data is available in your dataset for each column or row.
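Since count() is also available on a Series, a minimal sketch of that case may help. The Series name fees and its values are made up for illustration; they are not part of the article's dataset.

```python
import numpy as np
import pandas as pd

# Hypothetical Series with two missing entries (None and NaN)
fees = pd.Series([22000, None, 24000, np.nan, 26000])

non_null = fees.count()  # number of non-NA entries only
total = len(fees)        # number of all entries, missing or not

print(non_null)  # 3
print(total)     # 5
```

Comparing the two numbers is a quick way to see how many entries are missing.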

Syntax of Pandas DataFrame.count()

Following is the syntax of the DataFrame.count() function.


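# Syntax of DataFrame.count() in pandas 2.0 and later
DataFrame.count(axis=0, numeric_only=False)

# Syntax in pandas versions before 2.0
DataFrame.count(axis=0, level=None, numeric_only=False)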

Parameters of count()

Following are the parameters of pandas count() function.

  • axis – {0 or ‘index’, 1 or ‘columns’}, default 0: If 0 or ‘index’, counts are generated for each column. If 1 or ‘columns’, counts are generated for each row.
  • level – int or str, optional: If the axis is a MultiIndex (hierarchical), count along a particular level, collapsing into a DataFrame. This parameter is available only in pandas versions before 2.0.
  • numeric_only – bool, default False: If True, include only float, int, or boolean data.
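As a small sketch of the numeric_only parameter described above, the example below compares the default call with numeric_only=True. The DataFrame df_demo and its values are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with one text column and two numeric columns
df_demo = pd.DataFrame({
    'Courses': ["Spark", None, "Hadoop"],
    'Fee': [22000, np.nan, 25000],
    'Discount': [1000, 2300, np.nan],
})

# Default: every column is counted
all_counts = df_demo.count()

# numeric_only=True: the text column 'Courses' is left out of the result
numeric_counts = df_demo.count(numeric_only=True)

print(all_counts)
print(numeric_counts)
```

With numeric_only=True the returned Series has entries only for Fee and Discount.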

Return value

It returns a pandas Series holding the number of non-NA cells for each column or row, or a DataFrame if level is specified.
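In pandas versions where count() no longer accepts level, a similar per-level result can be sketched with groupby(level=...) followed by count(). This is an illustration under assumptions: the MultiIndex names Course and Mode and all values below are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level index: course name and delivery mode
index = pd.MultiIndex.from_tuples(
    [("Spark", "online"), ("Spark", "classroom"), ("Hadoop", "online")],
    names=["Course", "Mode"],
)
df_multi = pd.DataFrame(
    {"Fee": [22000, np.nan, 25000], "Discount": [1000, 1200, np.nan]},
    index=index,
)

# Count non-NA values per column within each value of the 'Course' level
per_course = df_multi.groupby(level="Course").count()
print(per_course)
```

The result is a DataFrame with one row per course and one column per original column.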

Usage of Pandas DataFrame count() Function

The count() function in Pandas is used to count the number of non-null values in each column or row of a DataFrame.

Now, let’s create a Pandas DataFrame from a Python dictionary, where the columns are Courses, Courses Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
import numpy as np
technologies= ({
    'Courses':["Spark","PySpark","Hadoop",None,"Python","Pandas"],
    'Courses Fee' :[22000,25000,np.nan,23000,24000,26000],
    'Duration':['30days',np.nan,'50days','30days', None,np.nan],
    'Discount':[1000,None,2300,np.nan,1200,2500]
              })
df = pd.DataFrame(technologies)
print(df)

Yields below output.


# Output:
   Courses  Courses Fee Duration  Discount
0    Spark      22000.0   30days    1000.0
1  PySpark      25000.0      NaN       NaN
2   Hadoop          NaN   50days    2300.0
3     None      23000.0   30days       NaN
4   Python      24000.0     None    1200.0
5   Pandas      26000.0      NaN    2500.0

Pandas DataFrame count() Function

You can get the number of non-NaN values in each column of a DataFrame using the DataFrame.count() function. To count per column, pass axis=0, which is also the default. The function ignores all None and NaN values and returns the count for each column.

Note that the values None, NaN, and NaT are considered NA. By default, numpy.inf is not treated as NA, so infinite values are counted.


# Use dataframe.count() function
df2 = df.count()
print(df2)

# Get count of each pandas dataframe column
df2 = df.count(axis = 0)
print(df2)

Yields below output.


# Output:
Courses        5
Courses Fee    5
Duration       3
Discount       4
dtype: int64
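To illustrate which values count() treats as missing, here is a small sketch with default pandas settings. The DataFrame df_na and its column names are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame mixing infinite values, NaN, and NaT
df_na = pd.DataFrame({
    "value": [1.0, np.inf, np.nan, -np.inf],
    "when": pd.to_datetime(["2024-01-01", None, "2024-01-03", None]),
})

counts = df_na.count()
print(counts)
# value: 3 (inf and -inf are counted, NaN is not)
# when:  2 (the two NaT entries are not counted)
```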

Get Count of Each Pandas DataFrame Row

You can get the number of non-NA values in each row of a DataFrame using the DataFrame.count() function. To count per row, pass axis='columns' or axis=1 to count(). Now, let’s run DataFrame.count() to get the count of each row, ignoring None and NaN values.


# Get count of each pandas dataframe row
df2 = df.count(axis='columns')
print(df2)

# Use dataframe.count() function along axis=1
df2 = df.count(axis = 1)
print(df2)

Yields below output.


# Output:
0    4
1    2
2    3
3    2
4    3
5    3
dtype: int64
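One possible use of the per-row counts is to keep only rows that have enough non-null values. The sketch below assumes a hypothetical DataFrame df_rows and an arbitrary threshold of 2.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame with a different number of missing values per row
df_rows = pd.DataFrame({
    'Courses': ["Spark", "PySpark", None],
    'Fee': [22000, np.nan, 23000],
    'Discount': [1000, np.nan, 1200],
})

# Number of non-NA values in each row
row_counts = df_rows.count(axis=1)

# Keep rows with at least 2 non-NA values (threshold chosen for illustration)
complete_enough = df_rows[row_counts >= 2]

print(row_counts.tolist())             # [3, 1, 2]
print(complete_enough.index.tolist())  # [0, 2]
```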

Complete Example For Pandas DataFrame count() Function


# Complete Example For Pandas DataFrame count() Function
import pandas as pd
import numpy as np
technologies= ({
    'Courses':["Spark","PySpark","Hadoop",None,"Python","Pandas"],
    'Courses Fee' :[22000,25000,np.nan,23000,24000,26000],
    'Duration':['30days',np.nan,'50days','30days', None,np.nan],
    'Discount':[1000,None,2300,np.nan,1200,2500]
              })
df = pd.DataFrame(technologies)
print(df)

# Use dataframe.count() function
df2 = df.count()
print(df2)

# Get count of each pandas dataframe column
df2 = df.count(axis = 0)
print(df2)

# Get count of each pandas dataframe row
df2 = df.count(axis='columns')
print(df2)

# Use dataframe.count() function along axis=1
df2 = df.count(axis = 1)
print(df2)

FAQ on Pandas DataFrame count() Function

What is the purpose of the count() function in Pandas?

The count() function is used to count the number of non-null (non-NaN) entries in each column or row of a DataFrame.

How does the count() function handle missing values?

The count() function counts only non-null values. If a column has missing data (None, NaN, or NaT), those entries are excluded from the count.

Can the count() function be applied to rows instead of columns?

Yes, by setting the axis parameter. Use axis=0 (the default) to count the non-null values in each column, or axis=1 to count the non-null values in each row.

What does the count() function return when used on a DataFrame?

The count() function returns a Series that contains the count of non-null entries for each column (or row, if axis=1 is set).

Is it possible to count non-null entries only for specific columns?

You can count entries for specific columns by selecting those columns first and then applying the count() function.
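A minimal sketch of selecting columns before counting follows. The DataFrame df_sel and its values are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame
df_sel = pd.DataFrame({
    'Courses': ["Spark", None, "Hadoop", "Python"],
    'Fee': [22000, 25000, np.nan, 24000],
    'Discount': [1000, np.nan, np.nan, 1200],
})

# Select several columns with a list, then count: returns a Series
subset_counts = df_sel[['Courses', 'Discount']].count()

# Select a single column, then count: returns a single integer
discount_count = df_sel['Discount'].count()

print(subset_counts)
print(discount_count)  # 2
```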

How is the count() function different from the size attribute?

The count() function excludes NaN values, whereas the size attribute returns the total number of entries, including NaNs, in the DataFrame or Series.
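The difference can be seen side by side in a short sketch. The DataFrame df_size below is hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical DataFrame: 3 rows x 2 columns, with 3 missing cells
df_size = pd.DataFrame({
    'Fee': [22000, np.nan, 24000],
    'Discount': [1000, None, np.nan],
})

total_cells = df_size.size              # rows * columns, missing cells included
non_null_cells = df_size.count().sum()  # non-NA cells summed over all columns

fee_size = df_size['Fee'].size          # all entries in one column
fee_count = df_size['Fee'].count()      # non-NA entries in that column

print(total_cells, non_null_cells)  # 6 3
print(fee_size, fee_count)          # 3 2
```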

Conclusion

In this article, I have explained the pandas DataFrame count() function and how to use it to get the number of non-NA values in each column or each row of a DataFrame along a specified axis.

Happy Learning !!
