How to Create Pandas Pivot Multiple Columns

  • Post author:
  • Post category:Pandas / Python
  • Post last modified:January 24, 2023

We can create a Pandas pivot table with multiple columns and return reshaped DataFrame. By manipulating given index or column values we can reshape the data based on column values. Use the pandas.pivot_table to create a spreadsheet-style pivot table in pandas DataFrame. This function does not support data aggregation, multiple values will result in a Multi-Index in the columns. In this article, I will explain how to create a pivot table with multiple columns.

1. Quick Examples of Pandas Pivot Table with Multiple Columns

If you are in hurry below are some quick examples of pivot tables with multiple columns.


# Below are the quick examples

# Example 1: Create a pivot table with a single index
p_table = pd.pivot_table(df, index = ['Gender'])

# Example 2: Create a pivot table with multiple columns
p_table = pd.pivot_table(df, index = ['Gender', 'Courses', 'Fee'])

2. Syntax of Pivot Table

Following is the syntax of the pivot_table().


# Syntax of pivot table
pivot_table(data, index=None, columns=None, values=None)

# Another Syntax
DataFrame.pivot(index=None, columns=None, values=None)

2.1 Parameters of Pivot Table

Below are the parameters of the pivot table

  • data : Is a DataFrame
  • values : Are the numeric data in a given DataFrame, that are to be aggregated.
  • index : Defines the rows of the pivot table
  • columns : Defines the columns of the pivot table

3. Create Pandas DataFrame

Python pandas is widely used for data science/data analysis and machine learning applications. It is built on top of another popular package named NumPy, which provides scientific computing in Python. pandas DataFrame is a 2-dimensional labeled data structure with rows and columns (columns of potentially different types like integers, strings, float, None, Python objects e.t.c). You can think of it as an excel spreadsheet or SQL table.

We can create DataFrame in many ways here, I will create Pandas DataFrame using Python Dictionary.


import pandas as pd
# Create a DataFrame
df = pd.DataFrame({'Gender' : ['Female', 'Male', 'Male', 'Male', 'Female'],
                  'Courses': ['Java', 'Spark', 'PySpark', 'C', 'Pandas'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})
print(df)

Yields below output.


# Output:
  Gender  Courses    Fee  Discount
0  Female     Java  15000      1100
1    Male    Spark  17000       800
2    Male  PySpark  27000      1000
3    Male        C  29000      1600
4  Female   Pandas  12000       600

4. Create the Pivot Table with Multiple Columns

Using the Pandas pivot_table() function we can reshape the DataFrame on multiple columns in the form of an Excel pivot table. To group the data in a pivot table we will need to pass a DataFrame into this function and the multiple columns you wanted to group as an index.

Here, I will take a list of values as an index and pass it into pivot_table(), it will return these values as columns of the pivot table. For example,


# Create pivot table with multiple columns
p_table = pd.pivot_table(df, index = ['Gender', 'Courses', 'Fee'])
print(p_table)

Yields below output.


# Output:
                       Discount
Gender Courses Fee            
Female Java    15000      1100
       Pandas  12000       600
Male   C       29000      1600
       PySpark 27000      1000
       Spark   17000       800

5. Pivot Table with Multiple Aggregation functions

Aggregate function aggfunc param takes np.mean() function as a default function for grouping the data while creating pivot table. The values in the pivot table are the result of the summarization that aggfunc applies to the feature data.

We can use different aggregate functions, as I mentioned above if it is set to the dictionary, where the keys are aggregated to columns and values are functions or lists of functions. For example,

Here, I have taken np.mean() and np.sum() as a aggfunc functions.


# Create pivot table using multiple aggfunc
p_table = pd.pivot_table(df, index=['Gender'], aggfunc= {'Fee': 'mean', 'Discount': 'sum'})
print(p_table)

Yields below output.


# Output:
         Discount           Fee
Gender                        
Female      1700  13500.000000
Male        3400  24333.333333

6. Conclusion :

In this article, I have explained how to create a Panda pivot table with multiple columns with examples.

References

https://pandas.pydata.org/docs/reference/api/pandas.pivot_table

Leave a Reply

You are currently viewing How to Create Pandas Pivot Multiple Columns