
Pandas DataFrame count() Function


The pandas DataFrame.count() function counts the non-NA cells in each column or each row, depending on the axis you specify. It works with non-floating-point data as well. To count the non-NA cells in each row, pass axis=1 or 'columns'; to count them in each column, pass axis=0 or 'index', which is the default.

In this article, I will explain the pandas DataFrame.count() function, its syntax and parameters, and how to use it to return the number of non-NA cells for each column or row along a specified axis.

1. Quick Examples of Pandas DataFrame count() Function

If you are in a hurry, below are some quick examples of how to use the DataFrame count() function. They assume a DataFrame named df already exists.


# Below are the quick examples

# Example 1: Use dataframe.count() function
df2 = df.count()

# Example 2: Get count of each 
# Pandas dataframe column
df2 = df.count(axis = 0)

# Example 3: Get count of each 
# Pandas dataframe row
df2 = df.count(axis='columns')

# Example 4: Use dataframe.count() function 
# Along axis=1
df2 = df.count(axis = 1)

2. Syntax of Pandas DataFrame.count()

Following is the syntax of the DataFrame.count() function.


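# Syntax of DataFrame.count()
DataFrame.count(axis=0, level=None, numeric_only=False)

2.1 Parameters of count()

Following are the parameters of the pandas count() function.

axis: {0 or 'index', 1 or 'columns'}, default 0. With 0 or 'index', counts are generated for each column; with 1 or 'columns', counts are generated for each row.
level: int or str, optional. If the axis is a MultiIndex, count along a particular level. This parameter was deprecated in pandas 1.3 and removed in pandas 2.0, so the signature there is DataFrame.count(axis=0, numeric_only=False).
numeric_only: bool, default False. If True, include only float, int, or boolean data.

2.2 Return value of count()

It returns a pandas Series holding the number of non-NA cells for each column or row. In pandas versions that still accept level, it returns a DataFrame when level is specified.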
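As a small illustration of the numeric_only parameter and the Series return value, here is a minimal sketch. The DataFrame named sample and its columns are hypothetical and are not part of the article's data.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, used only for this illustration
sample = pd.DataFrame({
    "name": ["a", None, "c"],
    "score": [1.5, np.nan, 3.0],
    "passed": [True, False, True],
})

# Count non-NA cells in every column
all_counts = sample.count()

# Count non-NA cells only in float, int, and boolean columns
numeric_counts = sample.count(numeric_only=True)

print(type(all_counts).__name__)  # Series
print(numeric_counts)
```

With numeric_only=True the text column name is left out of the result, so only score and passed are counted.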

Now, let’s create a pandas DataFrame from a Python dictionary, where the columns are Courses, Courses Fee, Duration, and Discount.


# Create DataFrame
import pandas as pd
import numpy as np
technologies= ({
    'Courses':["Spark","PySpark","Hadoop",None,"Python","Pandas"],
    'Courses Fee' :[22000,25000,np.nan,23000,24000,26000],
    'Duration':['30days',np.nan,'50days','30days', None,np.nan],
    'Discount':[1000,None,2300,np.nan,1200,2500]
              })
df = pd.DataFrame(technologies)
print(df)

This yields the output below.


# Output:
   Courses  Courses Fee Duration  Discount
0    Spark      22000.0   30days    1000.0
1  PySpark      25000.0      NaN       NaN
2   Hadoop          NaN   50days    2300.0
3     None      23000.0   30days       NaN
4   Python      24000.0     None    1200.0
5   Pandas      26000.0      NaN    2500.0

3. Pandas DataFrame count() Function

You can count the non-NA values in each column of a DataFrame using the DataFrame.count() function. To count per column, pass axis=0 as an argument, or call the function with no arguments, since axis=0 is the default. It ignores all None and NaN values and returns the count for each column.

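Note that the values None, NaN, NaT, and pandas.NA are considered NA. By default, numpy.inf is not treated as NA; older pandas versions treat it as NA only when the option pandas.options.mode.use_inf_as_na is enabled.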


# Use dataframe.count() function
df2 = df.count()
print(df2)

# Get count of each pandas dataframe column
df2 = df.count(axis = 0)
print(df2)

Both calls yield the same output, shown below.


# Output:
Courses        5
Courses Fee    5
Duration       3
Discount       4
dtype: int64
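To check which values count() skips, the following minimal sketch builds a small DataFrame of edge cases. The name probe and its two columns are hypothetical, and the result assumes default pandas options.

```python
import numpy as np
import pandas as pd

# Hypothetical edge-case data, stored as object columns so each value is kept as-is
probe = pd.DataFrame({
    "na_like": pd.Series([None, np.nan, pd.NaT, pd.NA], dtype=object),
    "not_na": pd.Series([np.inf, -np.inf, "", 0], dtype=object),
})

# None, NaN, NaT and pd.NA are skipped; infinities, empty strings and zeros are counted
probe_counts = probe.count()
print(probe_counts)
```

Here na_like should report 0 and not_na should report 4, which shows that empty strings and infinite values are still counted as present.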

4. Get Count of Each Pandas DataFrame Row

You can count the non-NA values in each row of a DataFrame using the DataFrame.count() function. To count per row, pass axis='columns' or axis=1 as an argument to count(). Now, let’s run DataFrame.count() to get the count of each row, ignoring None and NaN values.


# Get count of each pandas dataframe row
df2 = df.count(axis='columns')
print(df2)

# Use dataframe.count() function along axis=1
df2 = df.count(axis = 1)
print(df2)

Both calls yield the same output, shown below.


# Output:
0    4
1    2
2    3
3    2
4    3
5    3
dtype: int64
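One possible use of the per-row counts is to keep only the rows that are mostly filled in. The sketch below rebuilds the article's DataFrame and filters on an arbitrary threshold of 3 non-NA values, chosen here only for illustration. It also compares the result with dropna(thresh=3), which keeps rows that have at least that many non-NA values.

```python
import numpy as np
import pandas as pd

# Same data as the article's DataFrame
df = pd.DataFrame({
    'Courses': ["Spark", "PySpark", "Hadoop", None, "Python", "Pandas"],
    'Courses Fee': [22000, 25000, np.nan, 23000, 24000, 26000],
    'Duration': ['30days', np.nan, '50days', '30days', None, np.nan],
    'Discount': [1000, None, 2300, np.nan, 1200, 2500],
})

# Number of non-NA cells in each row
row_counts = df.count(axis=1)

# Keep rows with at least 3 non-NA cells (threshold is an assumption for this example)
well_filled = df[row_counts >= 3]

# dropna(thresh=3) selects the same rows
same_rows = df.dropna(thresh=3)

print(well_filled.index.tolist())  # [0, 2, 4, 5]
```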

5. Complete Example For Pandas DataFrame count() Function


# Complete Example For Pandas DataFrame count() Function
import pandas as pd
import numpy as np
technologies= ({
    'Courses':["Spark","PySpark","Hadoop",None,"Python","Pandas"],
    'Courses Fee' :[22000,25000,np.nan,23000,24000,26000],
    'Duration':['30days',np.nan,'50days','30days', None,np.nan],
    'Discount':[1000,None,2300,np.nan,1200,2500]
              })
df = pd.DataFrame(technologies)
print(df)

# Use dataframe.count() function
df2 = df.count()
print(df2)

# Get count of each pandas dataframe column
df2 = df.count(axis = 0)
print(df2)

# Get count of each pandas dataframe row
df2 = df.count(axis='columns')
print(df2)

# Use dataframe.count() function along axis=1
df2 = df.count(axis = 1)
print(df2)

6. Conclusion

In this article, I have explained the pandas DataFrame count() function and how to use it to count the non-NA values in each column or each row of a DataFrame along a specified axis.

Happy Learning !!
