We can create a Pandas pivot table with multiple columns and return reshaped DataFrame. By manipulating given index or column values we can reshape the data based on column values. Use the pandas.pivot_table
to create a spreadsheet-style pivot table in pandas DataFrame. This function does not support data aggregation, multiple values will result in a Multi-Index in the columns. In this article, I will explain how to create a pivot table with multiple columns.
1. Quick Examples of Pandas Pivot Table with Multiple Columns
If you are in hurry below are some quick examples of pivot tables with multiple columns.
# Below are the quick examples
# Example 1: Create a pivot table with a single index
p_table = pd.pivot_table(df, index = ['Gender'])
# Example 2: Create a pivot table with multiple columns
p_table = pd.pivot_table(df, index = ['Gender', 'Courses', 'Fee'])
2. Syntax of Pivot Table
Following is the syntax of the pivot_table().
# Syntax of pivot table
pivot_table(data, index=None, columns=None, values=None)
# Another Syntax
DataFrame.pivot(index=None, columns=None, values=None)
2.1 Parameters of Pivot Table
Below are the parameters of the pivot table
data
: Is a DataFramevalues
: Are the numeric data in a given DataFrame, that are to be aggregated.index
: Defines the rows of the pivot tablecolumns
: Defines the columns of the pivot table
3. Create Pandas DataFrame
Python pandas is widely used for data science/data analysis and machine learning applications. It is built on top of another popular package named NumPy, which provides scientific computing in Python. pandas DataFrame is a 2-dimensional labeled data structure with rows and columns (columns of potentially different types like integers, strings, float, None, Python objects e.t.c). You can think of it as an excel spreadsheet or SQL table.
We can create DataFrame in many ways here, I will create Pandas DataFrame using Python Dictionary.
import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'Gender' : ['Female', 'Male', 'Male', 'Male', 'Female'],
'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
'Fee': [15000, 17000, 27000, 29000, 12000],
'Discount': [1100, 800, 1000, 1600, 600]})
print(df)
Yields below output.
# Output:
Gender Courses Fee Discount
0 Female Java 15000 1100
1 Male Spark 17000 800
2 Male PySpark 27000 1000
3 Male C 29000 1600
4 Female Pandas 12000 600
4. Create the Pivot Table with Multiple Columns
Using the Pandas pivot_table() function we can reshape the DataFrame on multiple columns in the form of an Excel pivot table. To group the data in a pivot table we will need to pass a DataFrame
into this function and the multiple columns you wanted to group as an index.
Here, I will take a list of values as an index and pass it into pivot_table(), it will return these values as columns of the pivot table. For example,
# Create pivot table with multiple columns
p_table = pd.pivot_table(df, index = ['Gender', 'Courses', 'Fee'])
print(p_table)
Yields below output.
# Output:
Discount
Gender Courses Fee
Female Java 15000 1100
Pandas 12000 600
Male C 29000 1600
PySpark 27000 1000
Spark 17000 800
5. Pivot Table with Multiple Aggregation functions
Aggregate function aggfunc
param takes np.mean()
function as a default function for grouping the data while creating pivot table. The values in the pivot table are the result of the summarization that aggfunc
applies to the feature data.
We can use different aggregate functions, as I mentioned above if it is set to the dictionary, where the keys are aggregated to columns and values are functions or lists of functions. For example,
Here, I have taken np.mean() and np.sum() as a aggfunc
functions.
# Create pivot table using multiple aggfunc
p_table = pd.pivot_table(df, index=['Gender'], aggfunc= {'Fee': 'mean', 'Discount': 'sum'})
print(p_table)
Yields below output.
# Output:
Discount Fee
Gender
Female 1700 13500.000000
Male 3400 24333.333333
6. Conclusion :
In this article, I have explained how to create a Panda pivot table with multiple columns with examples.
Related Articles
- Drop Rows From Pandas DataFrame Examples
- Drop Single & Multiple Columns From Pandas DataFrame
- Change the Order of Pandas DataFrame Columns
- Pandas groupby() and sum() With Examples
- Difference Between loc and iloc in Pandas DataFrame
- How to Create Pandas Pivot Table Count
- Pandas Pivot Table Explained with Examples
- How to Unpivot DataFrame in Pandas?
- Count NaN Values in Pandas DataFrame
- Select pandas columns based on condition
References
https://pandas.pydata.org/docs/reference/api/pandas.pivot_table