In Pandas pivot table can be used to display the count
values of certain columns. Pandas pivot()
or pivot_table()
function is used to make a spreadsheet-style pivot table from a given DataFrame. In Python Pandas library produces many functions that give us more flexibility to create and analyze the pivot table among all of these pivot()
or pivot_table()
is one that groups, summarize, and aggregates the data. In this article, I will explain how to create count values in a pivot table over column values with examples.
Key Points –
- Utilize the
pivot_table()
function from the Pandas library to create the pivot table, specifying the DataFrame, index, columns, and setting theaggfunc
parameter to'count'
to perform a count aggregation. - Ensure that the DataFrame provided to the pivot table function contains the necessary data fields for analysis, including categorical variables for rows and columns, as well as the data to be counted.
- Load your dataset into a Pandas DataFrame using
pd.read_csv()
or any appropriate method for your data format. Ensure your DataFrame contains the necessary columns for the pivot table analysis. - Use the
pd.pivot_table()
function to create the pivot table. Specify the DataFrame (data
), index (rows), columns, and values parameters. For a count pivot table, set theaggfunc
parameter to'count'
. - Optionally, you can customize the pivot table further by adding additional parameters such as
margins
,fill_value
, ordropna
.
Quick Examples of Getting Count of Pandas Pivot Table
If you are in hurry below are some quick examples of how to get the count of a pivot table.
# Below are the quick examples
# Example 1 : Create count value in a pivot table
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc = 'count' )
# Example 2 : Create unique count values
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique )
Syntax of Pandas Pivot Table
Following is the syntax of the Pandas.pivot_table()
.
# Syntax of Pandas pivot table.
pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc=’mean’, fill_value=None, margins=False, dropna=True, margins_name=’All’, observed=False)
# Another syntax
DataFrame.pivot(index=None, columns=None, values=None)
Parameters of Pivot Table
Below are the parameters of the pivot table
data
– DataFrame object.values
: Are the numeric data in a given DataFrame, that are be aggregated.index
: Defines the rows of the pivot table. Column, Grouper, Array, or list of the previous. If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list).columns
: Defines the columns of the pivot table. Column, Grouper, Array, or list of the previous. If an array is passed, it must be the same length as the data. The list can contain any of the other types (except list).aggfunc
– Function to use for aggregation, default is ‘mean’. Can be ‘sum’, ‘mean’, ‘count’, ‘min’, ‘max’, etc.fill_value
– Value to replace missing values with.margins
– Add all row/columns (e.g. subtotals) (default is False).dropna
– Do not include columns whose entries are all NaN.margins_name
– Name of the row/column that will contain the totals when margins is True.
Return Value
It returns pivot table with count values.
Create Pandas DataFrame
Python pandas is widely used for data science/data analysis and machine learning applications. It is built on top of another popular package named NumPy, which provides scientific computing in Python. pandas DataFrame is a 2-dimensional labeled data structure with rows and columns (columns of potentially different types like integers, strings, float, None, Python objects e.t.c). You can think of it as an excel spreadsheet or SQL table.
We can create pandas DataFrame in many ways here, I will create DataFrame using Python Dictionary.
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'Gender' : ['Female', 'Male', 'Male', 'Male', 'Female', 'Male', 'Female'],
'Courses': ['Java', 'Spark', 'PySpark','Java','C', 'PySpark', 'Java'],
'Fee': [15000, 17000, 27000, 29000, 12000, 29000, 15000],
'Discount': [1100, 800, 1000, 1600, 600, 1000, 1100]})
print(df)
Yields below output.
# Output:
Gender Courses Fee Discount
0 Female Java 15000 1100
1 Male Spark 17000 800
2 Male PySpark 27000 1000
3 Male Java 29000 1600
4 Female C 12000 600
5 Male PySpark 29000 1000
6 Female Java 15000 1100
Create Pandas Pivot Table with Count Value
Pass count()
statistical function as aggfunc
into a pivot table against column values, it will return the count values of pivot table over a specified column. For example,
# Create count value in a pivot table
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc = 'count' )
print(p_table)
Yields below output
# Output:
Discount
Courses C Java PySpark Spark
Gender
Female 1.0 2.0 NaN NaN
Male NaN 1.0 2.0 1.0
Create Pandas Pivot Table With Unique Counts
Moreover, we can count the unique presences of a particular observation (row) in a pivot table using aggfunc= pd.Series.nunique
function that will allow us to count only the distinct rows in the DataFrame. For example,
# Create unique count values
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique )
print(p_table)
Yields below output.
# Output:
Discount
Courses C Java PySpark Spark
Gender
Female 1.0 1.0 NaN NaN
Male NaN 1.0 1.0 1.0
We can replace the NaN Values with specified values by using the fill_value
parameter of the pivot_table()
function. For example,
p_table = pd.pivot_table(df, index= ['Gender'], columns = ['Courses'], values=['Discount'], aggfunc=pd.Series.nunique , fill_value = '-')
print(p_table)
Yields below output.
# Output:
Discount
Courses C Java PySpark Spark
Gender
Female 1.0 1.0 - -
Male - 1.0 1.0 1.0
Frequently Asked Questions on Create Pandas Pivot Table Count
A Pandas pivot table is a powerful tool used to reshape and summarize data in a tabular format. It allows users to aggregate and summarize data based on one or more columns.
You can create a pivot table in Pandas using the pivot_table()
function. This function requires specifying the DataFrame, index, columns, values, and aggregation function parameters.
To create a pivot table with count values, you can use the aggfunc='count'
parameter in the pivot_table()
function. This will count the occurrences of each category within the specified index and columns.
If you want to count unique values instead of total occurrences, you can use the aggfunc=pd.Series.nunique
parameter. This will count the number of unique values within each category.
You can handle NaN values in the pivot table using the fill_value
parameter. This allows you to specify a value that replaces any NaN values resulting from the pivot table operation.
You can create hierarchical row or column labels in a pivot table by passing a list of column names to the index
or columns
parameter. This creates multi-level row or column labels based on the provided list.
Conclusion
In this article, I have explained how to create Pandas count values in a pivot table over column values with examples. also learned how to unique counts of a column.
Related Articles
- How to Create Pandas Pivot Multiple Columns
- Pandas Pivot Table Explained with Examples
- Pandas groupby() and sum() With Examples
- How to Pandas Pivot Multiple Columns
- Drop Rows From Pandas DataFrame Examples
- Drop Single & Multiple Columns From Pandas DataFrame
- Change the Order of Pandas DataFrame Columns
- How to Get Size of Pandas DataFrame?
- Pandas groupby() and sum() With Examples
- Difference Between loc and iloc in Pandas DataFrame
- How to convert NumPy array to DataFrame?
- Pandas Write to Excel with Examples