How to Unpivot DataFrame in Pandas?
Post last modified: May 18, 2024

In Pandas, we can unpivot a DataFrame using the melt() function, which transforms the data from a wide format to a long format. The columns we choose to unpivot are stacked into rows, so the resulting long-format DataFrame has fewer columns and more rows than the original.

In this article, I will explain the melt() function, its syntax and parameters, and how to use it to unpivot a DataFrame in pandas.

Quick Examples of Unpivoting a Pandas DataFrame

Following are quick examples of unpivoting a DataFrame in pandas.


# Quick examples of unpivoting a pandas DataFrame
# (df is the sample DataFrame created later in this article)

# Example 1: Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, id_vars = 'Student Names', value_vars = ['Courses', 'Fee'])

# Example 2: Unpivot and rename the variable and value columns
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes')

# Example 3: Unpivot and keep the original index
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes',
   ignore_index = False)

Pandas melt() Introduction

Following is the syntax of the melt() function.


# Syntax of melt()
pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)

Parameters

Following are the parameters of the melt() function.

  • frame – The DataFrame to unpivot.
  • id_vars (optional) – Column(s) to use as identifier variables. These columns remain fixed while the other columns are unpivoted. This can be a single column label or a list of column labels.
  • value_vars (optional) – Column(s) to unpivot. If not specified, all columns not set as id_vars are unpivoted. This can be a single column label or a list of column labels.
  • var_name (optional) – Name to use for the ‘variable’ column. If None, it uses the name of the columns index (frame.columns.name) when one is set, otherwise variable.
  • value_name (optional) – Name to use for the ‘value’ column. The default is value.
  • col_level (optional) – If columns are a MultiIndex, this parameter specifies which level to melt.
  • ignore_index (optional) – If True, original index is ignored. If False, the original index is retained. Default is True.
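The col_level parameter is easier to follow with a small example. The sketch below assumes a hypothetical frame with two-level columns (the labels Name, Math, Physics, info, and T1 are made up for illustration) and melts on the top level only.

```python
import pandas as pd

# Hypothetical frame with two-level (MultiIndex) columns
wide = pd.DataFrame(
    [["Jenny", 80, 90], ["Singh", 70, 75]],
    columns=pd.MultiIndex.from_arrays(
        [["Name", "Math", "Physics"], ["info", "T1", "T1"]]
    ),
)

# col_level=0 matches id_vars and value_vars against the top column level
long_df = pd.melt(wide, col_level=0, id_vars=["Name"], value_vars=["Math", "Physics"])
print(long_df)
```

With MultiIndex columns, id_vars and value_vars need to be passed as lists, which is why Name is wrapped in a list here.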

Return Value

It returns the unpivoted data as a new pandas DataFrame; the original DataFrame is not modified.
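As a quick check of the return value, the sketch below uses a small two-row frame (made up for this illustration) to compare the pd.melt() function with the equivalent DataFrame.melt() method and to confirm that the input keeps its wide shape.

```python
import pandas as pd

# Small illustrative frame
df = pd.DataFrame({"Student Names": ["Jenny", "Singh"],
                   "Courses": ["Java", "Spark"],
                   "Fee": [15000, 17000]})

# The top-level function and the DataFrame method take the same arguments
by_function = pd.melt(df, id_vars="Student Names", value_vars=["Courses", "Fee"])
by_method = df.melt(id_vars="Student Names", value_vars=["Courses", "Fee"])

print(by_function.equals(by_method))  # compares the two results
print(df.shape)                       # shape of the untouched input
print(by_function.shape)              # shape of the new long frame
```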

Unpivot Pandas DataFrame Using melt()

The Pandas melt() function changes the shape of a DataFrame from wide format to long format. One or more columns are kept as identifiers, while the remaining columns are unpivoted: their names are collected in one column and their values in another.
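To make the change of shape concrete, here is a minimal sketch on a made-up two-row frame. Only id_vars is passed, so every other column is unpivoted, and the long frame ends up with one row per combination of original row and unpivoted column.

```python
import pandas as pd

# Illustrative wide frame: 2 rows, 1 identifier column, 2 columns to unpivot
wide = pd.DataFrame({"Student Names": ["Jenny", "Singh"],
                     "Fee": [15000, 17000],
                     "Discount": [1100, 800]})

# Only id_vars is given, so Fee and Discount are both melted
long_df = pd.melt(wide, id_vars="Student Names")

print(wide.shape)     # (2, 3)
print(long_df.shape)  # (4, 3): 2 rows x 2 melted columns; id, variable, value
print(long_df)
```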

Pandas is a widely used Python library for data science, data analysis, and machine learning applications. It is built on top of NumPy, the package that provides scientific computing in Python, and offers powerful data manipulation capabilities. The Pandas DataFrame is a 2-dimensional labeled data structure with rows and columns, where each column can hold a different type of data such as integers, strings, floats, or arbitrary Python objects. You can think of it as an Excel spreadsheet or a SQL table.

To run some examples of how to unpivot a DataFrame in Pandas, let’s create a Pandas DataFrame using data from a dictionary.


import pandas as pd

# Sample data: one row per student
df = pd.DataFrame({'Student Names': ['Jenny', 'Singh', 'Charles', 'Richard', 'Veena'],
                   'Courses': ['Java', 'Spark', 'PySpark', 'Hadoop', 'C'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})
print(df)

This yields the output below.


# Output:
  Student Names  Courses    Fee  Discount
0         Jenny     Java  15000      1100
1         Singh    Spark  17000       800
2       Charles  PySpark  27000      1000
3       Richard   Hadoop  29000      1600
4         Veena        C  12000       600

To unpivot data using the Pandas melt() function, you need to specify the id_vars parameter for the columns you want to keep fixed and the value_vars parameter for the columns you want to unpivot. This function will return the DataFrame in long format.


# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, id_vars = 'Student Names', value_vars = ['Courses', 'Fee'])
print(un_pivot)

This yields the output below.


# Output:
  Student Names variable    value
0         Jenny  Courses     Java
1         Singh  Courses    Spark
2       Charles  Courses  PySpark
3       Richard  Courses   Hadoop
4         Veena  Courses        C
5         Jenny      Fee    15000
6         Singh      Fee    17000
7       Charles      Fee    27000
8       Richard      Fee    29000
9         Veena      Fee    12000

In the above example, variable will contain the names of the columns specified in value_vars, and value will contain the corresponding values. However, you can customize these names by explicitly specifying var_name and value_name if needed.
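One detail of the output above is that text (Courses) and numbers (Fee) end up in the same value column. The sketch below, on a made-up two-row frame, compares that with melting only the numeric columns. The dtypes noted in the comments are what pandas typically reports, so treat them as an expectation rather than a guarantee.

```python
import pandas as pd
from pandas.api.types import is_integer_dtype

# Small illustrative frame
df = pd.DataFrame({"Student Names": ["Jenny", "Singh"],
                   "Courses": ["Java", "Spark"],
                   "Fee": [15000, 17000],
                   "Discount": [1100, 800]})

# Mixing a text column and a numeric column: value typically becomes object dtype
mixed = pd.melt(df, id_vars="Student Names", value_vars=["Courses", "Fee"])

# Melting only numeric columns keeps value numeric, so arithmetic still works
numeric = pd.melt(df, id_vars="Student Names", value_vars=["Fee", "Discount"])

print(mixed["value"].dtype)
print(numeric["value"].dtype)
print(numeric["value"].sum())
```

If you plan to aggregate the value column, melting columns of the same type together tends to be more convenient.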

Unpivot Pandas DataFrame using var_name & value_name

You can use the var_name and value_name parameters in the melt() function to customize the names of the variable and value columns in the unpivoted DataFrame.


# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes'
)
print(un_pivot)

This yields the output below.


# Output:
  Student Names Course details Attributes
0         Jenny        Courses       Java
1         Singh        Courses      Spark
2       Charles        Courses    PySpark
3       Richard        Courses     Hadoop
4         Veena        Courses          C
5         Jenny            Fee      15000
6         Singh            Fee      17000
7       Charles            Fee      27000
8       Richard            Fee      29000
9         Veena            Fee      12000

Unpivot Pandas DataFrame using ignore_index

The ignore_index parameter in the melt() function determines whether to ignore the original index of the DataFrame. By default, it’s set to True, which means the resulting DataFrame will have a new index range starting from 0. However, if you set it to False, the resulting DataFrame will retain the original index values.


# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes',
   ignore_index = False
)
print(un_pivot)

This yields the output below.


# Output:
  Student Names Course details Attributes
0         Jenny        Courses       Java
1         Singh        Courses      Spark
2       Charles        Courses    PySpark
3       Richard        Courses     Hadoop
4         Veena        Courses          C
0         Jenny            Fee      15000
1         Singh            Fee      17000
2       Charles            Fee      27000
3       Richard            Fee      29000
4         Veena            Fee      12000
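One possible use of the retained index is to bring the rows that came from the same original record back together. The following is a minimal sketch on a made-up two-row frame; the stable mergesort is chosen here so that rows with the same index label keep their melted order.

```python
import pandas as pd

# Small illustrative frame
df = pd.DataFrame({"Student Names": ["Jenny", "Singh"],
                   "Courses": ["Java", "Spark"],
                   "Fee": [15000, 17000]})

# Keep the original row labels in the long frame
long_df = pd.melt(df, id_vars="Student Names", value_vars=["Courses", "Fee"],
                  ignore_index=False)

# Sorting on the retained index places each student's rows next to each other
by_student = long_df.sort_index(kind="mergesort")
print(by_student)
```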

Conclusion

In conclusion, the Pandas melt() function is a powerful tool for reshaping or unpivoting DataFrames, allowing you to transform data from a wide format to a long format. By using various parameters such as id_vars, value_vars, var_name, value_name, and ignore_index, you can customize the unpivoting process to suit your specific data manipulation needs.
