• Post author:
  • Post category:Pandas
  • Post last modified:May 26, 2024
  • Reading time:13 mins read
You are currently viewing How to Create Pandas Pivot Table Count

In Pandas, a pivot table can be used to display the count values of specific columns. The pivot() or pivot_table() methods are used to create a spreadsheet-style pivot table from a given DataFrame. In Python Pandas library produces many functions that give us more flexibility to create and analyze the pivot table among all of these pivot() or pivot_table() is one that groups, summarize, and aggregates the data. In this article, I will explain how to create count values in a pivot table over column values.

Advertisements

Key Points –

  • Utilize the pivot_table() function from the Pandas library to create the pivot table, specifying the DataFrame, index, columns, and setting the aggfunc parameter to count to perform a count aggregation.
  • Ensure that the DataFrame provided to the pivot table function contains the necessary data fields for analysis, including categorical variables for rows and columns, as well as the data to be counted.
  • Load your dataset into a Pandas DataFrame using pd.read_csv() or any appropriate method for your data format. Ensure your DataFrame contains the necessary columns for the pivot table analysis.
  • Use the pd.pivot_table() function to create the pivot table. Specify the DataFrame (data), index (rows), columns, and values parameters. For a count pivot table, set the aggfunc parameter to count.
  • Optionally, you can customize the pivot table further by adding additional parameters such as margins, fill_value, or dropna.

Quick Examples of Getting Count of Pandas Pivot Table

Following are quick examples of getting the count of a pivot table.


# Quick examples of getting count of pivot table

# Example 1 : Create count value in a pivot table
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc = 'count' )

# Example 2 : Create unique count values
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique )

Pandas pivot_table() Introduction

Following is the syntax of the Pandas.pivot_table().


# Syntax of Pandas pivot table.
pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc=’mean’, fill_value=None, margins=False, dropna=True, margins_name=’All’, observed=False)

# Another syntax
DataFrame.pivot(index=None, columns=None, values=None)

Parameters

Following are the parameters of the pivot_table()

  • data – The DataFrame object is to be pivoted.
  • values – Column(s) to aggregate. If not specified, all remaining numeric columns will be aggregated.
  • index – column, Grouper, array, or list of the previous. Keys to group by on the pivot table index. If an array is passed, it must be the same length as the data. Lists of strings/arrays are passed as the keys.
  • columns – Defines the columns of the pivot table. Column, Grouper, Array, or list of the previous. If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list).
  • aggfunc – Function to use for aggregation, default is ‘mean’. Can be ‘sum’, ‘mean’, ‘count’, ‘min’, ‘max’, etc.
  • fill_value – Value to replace missing values with.
  • margins – Add all row/columns (e.g. subtotals) (default is False).
  • dropna – Do not include columns whose entries are all NaN.
  • margins_name – Name of the row/column that will contain the totals when margins is True.

Return Value

It returns pivot table with count values.

Create Pandas DataFrame

Python pandas is extensively used for data science, data analysis, and machine learning applications. It is built on top of another popular package called NumPy, which provides scientific computing capabilities in Python. A pandas DataFrame is a 2-dimensional labeled data structure with rows and columns, where columns can be of different data types such as integers, strings, floats, None, or Python objects. You can think of it as similar to an Excel spreadsheet or a SQL table.

To run some examples of creating pandas pivot table count, let’s create a Pandas DataFrame using data from a dictionary.


import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'Gender' : ['Female', 'Male', 'Male', 'Male', 'Female', 'Male', 'Female'],
                  'Courses': ['Java', 'Spark', 'PySpark','Java','C', 'PySpark', 'Java'],
                   'Fee': [15000, 17000, 27000, 29000, 12000, 29000, 15000],
                   'Discount': [1100, 800, 1000, 1600, 600, 1000, 1100]})
print(df)

Yields below output.


# Output:
   Gender  Courses    Fee  Discount
0  Female     Java  15000      1100
1    Male    Spark  17000       800
2    Male  PySpark  27000      1000
3    Male     Java  29000      1600
4  Female        C  12000       600
5    Male  PySpark  29000      1000
6  Female     Java  15000      1100

Create Count Value in a Pandas Pivot Table

Pass count() statistical function as aggfunc into a pivot table against column values, it will return the count values of pivot table over a specified column. For example,


# Create count value in a pivot table
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc = 'count' )
print(p_table)

# Output:
#         Discount                   
# Courses        C Java PySpark Spark
# Gender                             
# Female       1.0  2.0     NaN   NaN
# Male         NaN  1.0     2.0   1.0

Create Pandas Pivot Table With Unique Counts

Moreover, we can count the unique presences of a particular observation (row) in a pivot table using aggfunc= pd.Series.nunique function that will allow us to count only the distinct rows in the DataFrame. For example,


# Create unique count values
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique )
print(p_table)

# Output:
#        Discount                   
# Courses        C Java PySpark Spark
# Gender                             
# Female       1.0  1.0     NaN   NaN
# Male         NaN  1.0     1.0   1.0

We can replace the NaN Values with specified values by using the fill_value parameter of the pivot_table() function. For example,


p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique , fill_value = '-')
print(p_table)

# Output:
#         Discount                   
# Courses        C Java PySpark Spark
# Gender                             
# Female       1.0  1.0       -     -
# Male           -  1.0     1.0   1.0

FAQ on Count of Pivot Table

What is a Pandas pivot table?

A Pandas pivot table is a powerful tool used to reshape and summarize data in a tabular format. It allows users to aggregate and summarize data based on one or more columns.

How do I create a pivot table with count values in Pandas?

To create a pivot table with count values, you can use the aggfunc='count' parameter in the pivot_table() function. This will count the occurrences of each category within the specified index and columns.

What if I want to count unique values instead of total occurrences?

If you want to count unique values instead of total occurrences, you can use the aggfunc=pd.Series.nunique parameter. This will count the number of unique values within each category.

Is it possible to create hierarchical row or column labels in a pivot table?

To create hierarchical row or column labels in a pivot table by passing a list of column names to the index or columns parameter. This creates multi-level row or column labels based on the provided list.

Conclusion

In this article, you have learned to create Pandas count values in a pivot table over column values with examples. also learned how to unique counts of a column.

References