Site icon Spark By {Examples}

How to Create Pandas Pivot Table Count

Pandas pivot count

In Pandas pivot table can be used to display the count values of certain columns. Pandas pivot() or pivot_table() function is used to make a spreadsheet-style pivot table from a given DataFrame. In Python Pandas library produces many functions that give us more flexibility to create and analyze the pivot table among all of these pivot() or pivot_table() is one that groups, summarize, and aggregates the data. In this article, I will explain how to create count values in a pivot table over column values with examples.

Key Points –

Quick Examples of Getting Count of Pandas Pivot Table

If you are in hurry below are some quick examples of how to get the count of a pivot table.


# Below are the quick examples

# Example 1 : Create count value in a pivot table
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc = 'count' )

# Example 2 : Create unique count values
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique )

Syntax of Pandas Pivot Table

Following is the syntax of the Pandas.pivot_table().


# Syntax of Pandas pivot table.
pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc=’mean’, fill_value=None, margins=False, dropna=True, margins_name=’All’, observed=False)

# Another syntax
DataFrame.pivot(index=None, columns=None, values=None)

Parameters of Pivot Table

Below are the parameters of the pivot table

Return Value

It returns pivot table with count values.

Create Pandas DataFrame

Python pandas is widely used for data science/data analysis and machine learning applications. It is built on top of another popular package named NumPy, which provides scientific computing in Python. pandas DataFrame is a 2-dimensional labeled data structure with rows and columns (columns of potentially different types like integers, strings, float, None, Python objects e.t.c). You can think of it as an excel spreadsheet or SQL table.

We can create pandas DataFrame in many ways here, I will create DataFrame using Python Dictionary.


import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'Gender' : ['Female', 'Male', 'Male', 'Male', 'Female', 'Male', 'Female'],
                  'Courses': ['Java', 'Spark', 'PySpark','Java','C', 'PySpark', 'Java'],
                   'Fee': [15000, 17000, 27000, 29000, 12000, 29000, 15000],
                   'Discount': [1100, 800, 1000, 1600, 600, 1000, 1100]})
print(df)

Yields below output.


# Output:
   Gender  Courses    Fee  Discount
0  Female     Java  15000      1100
1    Male    Spark  17000       800
2    Male  PySpark  27000      1000
3    Male     Java  29000      1600
4  Female        C  12000       600
5    Male  PySpark  29000      1000
6  Female     Java  15000      1100

Create Pandas Pivot Table with Count Value

Pass count() statistical function as aggfunc into a pivot table against column values, it will return the count values of pivot table over a specified column. For example,


# Create count value in a pivot table
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc = 'count' )
print(p_table)

Yields below output


# Output:
        Discount                   
Courses        C Java PySpark Spark
Gender                             
Female       1.0  2.0     NaN   NaN
Male         NaN  1.0     2.0   1.0

Create Pandas Pivot Table With Unique Counts

Moreover, we can count the unique presences of a particular observation (row) in a pivot table using aggfunc= pd.Series.nunique function that will allow us to count only the distinct rows in the DataFrame. For example,


# Create unique count values
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique )
print(p_table)

Yields below output.


# Output:
       Discount                   
Courses        C Java PySpark Spark
Gender                             
Female       1.0  1.0     NaN   NaN
Male         NaN  1.0     1.0   1.0

We can replace the NaN Values with specified values by using the fill_value parameter of the pivot_table() function. For example,


p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique , fill_value = '-')
print(p_table)

Yields below output.


# Output:
        Discount                   
Courses        C Java PySpark Spark
Gender                             
Female       1.0  1.0       -     -
Male           -  1.0     1.0   1.0

Frequently Asked Questions on Create Pandas Pivot Table Count

What is a Pandas pivot table?

A Pandas pivot table is a powerful tool used to reshape and summarize data in a tabular format. It allows users to aggregate and summarize data based on one or more columns.

How do I create a pivot table in Pandas?

You can create a pivot table in Pandas using the pivot_table() function. This function requires specifying the DataFrame, index, columns, values, and aggregation function parameters.

How do I create a pivot table with count values in Pandas?

To create a pivot table with count values, you can use the aggfunc='count' parameter in the pivot_table() function. This will count the occurrences of each category within the specified index and columns.

What if I want to count unique values instead of total occurrences?

If you want to count unique values instead of total occurrences, you can use the aggfunc=pd.Series.nunique parameter. This will count the number of unique values within each category.

How do I handle NaN values in the pivot table?

You can handle NaN values in the pivot table using the fill_value parameter. This allows you to specify a value that replaces any NaN values resulting from the pivot table operation.

Is it possible to create hierarchical row or column labels in a pivot table?

You can create hierarchical row or column labels in a pivot table by passing a list of column names to the index or columns parameter. This creates multi-level row or column labels based on the provided list.

Conclusion

In this article, I have explained how to create Pandas count values in a pivot table over column values with examples. also learned how to unique counts of a column.

References

Exit mobile version