We can create a pandas pivot table on multiple columns and get back a reshaped DataFrame. The index and column values you pass determine how the data is grouped and laid out. Use the pandas.pivot_table() function to create a spreadsheet-style pivot table from a pandas DataFrame. This function aggregates the data (by default it takes the mean of each group), and passing multiple grouping columns or multiple values results in a MultiIndex on the rows or columns. In this article, I will explain how to create a pivot table with multiple columns.
Quick Examples of Pandas Pivot Table with Multiple Columns
If you are in a hurry, below are some quick examples of how to create pandas pivot tables with multiple columns.
# Quick examples of pandas pivot table with multiple columns
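# Example 1: Create a pivot table with a single index
# (values names the numeric columns to aggregate; without it, pandas 2.x
# may raise an error when the default mean reaches the text column Courses)
p_table = pd.pivot_table(df, index = ['Gender'], values = ['Fee', 'Discount'])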
# Example 2: Create a pivot table with multiple columns
p_table = pd.pivot_table(df, index = ['Gender', 'Courses', 'Fee'])
Syntax of Pivot Table
Following is the syntax of the pivot_table().
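# Syntax of pivot_table() (most commonly used parameters)
pandas.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None)
# Related method: DataFrame.pivot() reshapes without aggregating
DataFrame.pivot(index=None, columns=None, values=None)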
Parameters of Pivot Table
Below are the parameters of the pivot table.
data : The DataFrame to pivot.
values : The numeric data in the given DataFrame that are to be aggregated.
index : Defines the rows of the pivot table.
columns : Defines the columns of the pivot table.
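To see how these parameters fit together, here is a minimal sketch. The `sales` DataFrame and its `Mode` column are hypothetical and exist only for this illustration; they are not part of the article's sample data.

```python
import pandas as pd

# Hypothetical sample data, used only for this illustration
sales = pd.DataFrame({
    'Gender': ['Female', 'Male', 'Male', 'Female'],
    'Mode': ['Online', 'Online', 'Offline', 'Online'],
    'Fee': [15000, 17000, 27000, 12000],
})

# index -> row labels, columns -> column headers, values -> data to aggregate
table = pd.pivot_table(sales, index='Gender', columns='Mode',
                       values='Fee', aggfunc='sum')
print(table)
```

Combinations that never occur in the data (here, Female with Offline) appear as NaN in the result.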
We can create a DataFrame in many ways. Here, I will create a pandas DataFrame using a Python dictionary.
# Create DataFrame
import pandas as pd
df = pd.DataFrame({'Gender' : ['Female', 'Male', 'Male', 'Male', 'Female'],
'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
'Fee': [15000, 17000, 27000, 29000, 12000],
'Discount': [1100, 800, 1000, 1600, 600]})
print("Create DataFrame:\n",df)
Yields below output.
Create the Pivot Table with Multiple Columns
Using the pandas pivot_table() function, we can reshape the DataFrame on multiple columns in the form of an Excel pivot table. To group the data in a pivot table, pass the DataFrame into this function along with the columns you want to group by as the index.
Here, I will pass a list of column names as the index to pivot_table(). It returns these columns as the levels of a row MultiIndex, and the remaining numeric column (Discount) is aggregated with the default mean. For example,
# Create pivot table with multiple columns
p_table = pd.pivot_table(df, index = ['Gender', 'Courses', 'Fee'])
print("Create pivot table with multiple columns:\n",p_table)
Yields below output.
Pivot Table with Multiple Aggregation Functions
The aggfunc parameter sets the aggregate function and uses 'mean' as the default for grouping the data while creating the pivot table. The values in the pivot table are the result of the summarization that aggfunc applies to the grouped data.
We can use different aggregate functions for different columns by setting aggfunc to a dictionary, where the keys are the columns to aggregate and the values are functions or lists of functions. For example, here I have taken 'mean' for Fee and 'sum' for Discount as the aggfunc functions.
# Create pivot table using multiple aggfunc
p_table = pd.pivot_table(df, index=['Gender'], aggfunc= {'Fee': 'mean', 'Discount': 'sum'})
print(p_table)
Yields below output.
# Output:
        Discount           Fee
Gender
Female      1700  13500.000000
Male        3400  24333.333333
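As noted above, a dictionary value can also be a list of functions. The following is a minimal sketch using the same sample data as this article, rebuilt here so the snippet runs on its own. The choice of 'min' and 'max' is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
                   'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})

# A list applies several functions to one column; a single name applies one
p_table = pd.pivot_table(df, index=['Gender'], values=['Fee', 'Discount'],
                         aggfunc={'Fee': ['min', 'max'], 'Discount': 'sum'})
print(p_table)
```

Because at least one column has several functions, the result has two-level columns such as ('Fee', 'min') and ('Discount', 'sum').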
Frequently Asked Questions on Creating a Pandas Pivot Table with Multiple Columns
To create a pivot table with multiple columns in pandas, you can use the pivot_table() function and pass a list of column names to the index parameter, the columns parameter, or both.
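As a minimal sketch of grouping by several columns at once, the snippet below rebuilds this article's sample data and passes a list to the columns parameter. Leaving out index here is a choice made for this illustration only.

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
                   'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})

# Two grouping columns in `columns` produce two-level column headers
table = pd.pivot_table(df, values=['Fee', 'Discount'],
                       columns=['Gender', 'Courses'])
print(table)
```

With no index given, the aggregated value columns (Discount and Fee) become the rows of the result.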
You can specify multiple aggregation functions for different columns in a pandas pivot table using the aggfunc parameter. The aggfunc parameter allows you to define a dictionary that maps column names to the aggregation functions you want to apply.
When creating a pivot table in pandas, you can handle missing values using the fill_value parameter. The fill_value parameter allows you to specify a value that will be used to fill any missing (NaN) values in the resulting pivot table.
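Here is a minimal sketch of fill_value, using this article's sample data rebuilt so the snippet is self-contained. The fill value of 0 is an arbitrary choice for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
                   'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})

# Gender/Courses combinations with no rows show up as NaN
with_nan = pd.pivot_table(df, values='Fee', index='Gender', columns='Courses')

# fill_value replaces those missing cells in the result
filled = pd.pivot_table(df, values='Fee', index='Gender', columns='Courses',
                        fill_value=0)
print(filled)
```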
You can reset the index of a pandas pivot table using the reset_index() method. This method is used to reset the index of the DataFrame, and it returns a new DataFrame with the index reset.
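A minimal sketch of flattening a pivot table with reset_index(), again rebuilding this article's sample data so the snippet stands alone:

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
                   'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})

p_table = pd.pivot_table(df, index=['Gender', 'Courses'],
                         values=['Fee', 'Discount'])

# reset_index() moves the index levels back into ordinary columns
flat = p_table.reset_index()
print(flat)
```

The original p_table is left unchanged, since reset_index() returns a new DataFrame by default.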
You can filter rows or columns in a Pandas pivot table by using boolean indexing. Boolean indexing allows you to select rows or columns based on a specified condition.
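The sketch below shows one way boolean indexing might be applied to a pivot table built from this article's sample data. The 20000 threshold is an arbitrary value chosen for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
                   'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})

p_table = pd.pivot_table(df, index=['Gender', 'Courses'],
                         values=['Fee', 'Discount'])

# Keep only the rows whose mean Fee exceeds the threshold
high_fee = p_table[p_table['Fee'] > 20000]

# Keep only the columns that match a condition on the column labels
only_fee = p_table.loc[:, p_table.columns == 'Fee']
print(high_fee)
print(only_fee)
```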
It is possible to rename the axis names of a pandas pivot table using the rename_axis() method. This method allows you to rename the levels of the columns and index of the DataFrame. To change the column labels themselves, use the rename() method instead.
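A minimal sketch contrasting the two kinds of renaming, using this article's sample data. The new names ('Student Gender', 'Course Name', 'C Language') are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'Gender': ['Female', 'Male', 'Male', 'Male', 'Female'],
                   'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})

p_table = pd.pivot_table(df, values='Fee', index='Gender', columns='Courses')

# rename_axis() changes the names of the index and column axes
renamed_axes = p_table.rename_axis(index='Student Gender', columns='Course Name')

# rename() changes individual labels, here one column label
renamed_labels = p_table.rename(columns={'C': 'C Language'})
print(renamed_axes)
print(renamed_labels)
```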
Conclusion
In this article, I have explained, with examples, how to create a pandas pivot table with multiple columns using the pivot_table() function with the appropriate parameters. You can customize the pivot table by specifying the columns to be used as the index and columns, selecting the values to aggregate, and defining the aggregation functions.