• Post author:
  • Post category:Pandas
  • Post last modified:March 27, 2024
  • Reading time:9 mins read
You are currently viewing How to Unpivot DataFrame in Pandas?

In pandas we can unpivot a given DataFrame columns by using melt() method, where the format of the data changes from a wide format to a long format. Using melt() function parameters we can reduce the columns and increase the rows of the given DataFrame, this will give a long format of the DataFrame. In this article, I will explain how to unpivot the DataFrame in pandas using the melt() method and its syntax and parameters with examples.

Advertisements

1. Quick Examples of Pandas Unpivot the DataFrame

If you are in hurry below are some quick examples of unpivot the DataFrame in pandas.


# Below are the quick examples 

# Example 1: Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, id_vars = 'Student Names', value_vars = ['Courses', 'Fee') 

# Example 2: Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes')

# Example 3: Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes',
   ignore_index = False)

2. Syntax of the melt()

Following is the syntax of the melt() function.


# Syntax of melt()
pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)

2.1 Parameters of the melt() Function

Following are the parameters of the melt() function.

  • id_vars : (tuple, list, or ndarray, optional) Using this param we can set single or multiple columns that, will use as a identifiers of the new format.
  • value_vars : (tuple, list, or ndarray, optional) Using this param we can set the columns that are used to unpivot. If we not provide this parameter Pandas will use all remaining columns as a value_vars except for those specified in id_vars.
  • var_name : It defines column name for the variable column.
  • value_name : It defines column name for the value column.
  • col_level : If we have multi index columns then, use this level to melt.
  • ignore_index : Accepts a boolean, if True, original index is ignored. If False, the original index is retained. Index labels will be repeated as necessary.

2.2 Return Value of melt()

It returns unpivoted pandas DataFrame.

3. Unpivot Pandas DataFrame Using melt()

Pandas melt() function is used to change the shape of the given DataFrame. In which format is changed from wide to long, it means that one or more columns are used as identifiers and the remaining columns are used as values. Finally, this process makes the DataFrame unpivot.

Python pandas is widely used for data science/data analysis and machine learning applications. It is built on top of another popular package named Numpy, which provides scientific computing in Python. pandas DataFrame is a 2-dimensional labeled data structure with rows and columns (columns of potentially different types like integers, strings, float, None, Python objects e.t.c). You can think of it as an excel spreadsheet or SQL table.

First, let’s create Pandas DataFrame from Python dictionary.


import numpy as np
import pandas as pd
df = pd.DataFrame({'Student Names' : ['Jenny', 'Singh', 'Charles', 'Richard', 'Veena'],
                  'Courses': ['Java', 'Spark', 'PySpark','Hadoop','C'],
                   'Fee': [15000, 17000, 27000, 29000, 12000],
                   'Discount': [1100, 800, 1000, 1600, 600]})
print(df)

Yields below output.


# Output:
  Student Names  Courses    Fee  Discount
0         Jenny     Java  15000      1100
1         Singh    Spark  17000       800
2       Charles  PySpark  27000      1000
3       Richard   Hadoop  29000      1600
4         Veena        C  12000       600

Let’s unpivot the data using the Pandas melt() function, for that we need to pass id_vars and value_vars, it will return the long format unpivot DataFrame.


# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, id_vars = 'Student Names', value_vars = ['Courses', 'Fee') 
print(un_pivot)

Yields below output.


# Output:
   Student Names variable    value
0         Jenny  Courses     Java
1         Singh  Courses    Spark
2       Charles  Courses  PySpark
3       Richard  Courses   Hadoop
4         Veena  Courses        C
5         Jenny      Fee    15000
6         Singh      Fee    17000
7       Charles      Fee    27000
8       Richard      Fee    29000
9         Veena      Fee    12000

Here, 'variable' and 'value‘ are default values of var_name and value_name parameters respectively.

4. Unpivot Pandas DataFrame using var_name & value_name

We can also customize var_name & value_name these parameters with specified names in a melt(), it will return the customized var_name and value_name of unpivot DataFrame.


# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes'
)
print(un_pivot)

Yields below output.


# Output:
 Student Names Course details Attributes
0         Jenny        Courses       Java
1         Singh        Courses      Spark
2       Charles        Courses    PySpark
3       Richard        Courses     Hadoop
4         Veena        Courses          C
5         Jenny            Fee      15000
6         Singh            Fee      17000
7       Charles            Fee      27000
8       Richard            Fee      29000
9         Veena            Fee      12000

5. Unpivot Pandas DataFrame using ignore_index

We can set the False as an ignore_index and pass it into the melt() function, it will return the unpivot DataFrame with the original index. Default ignore_index is True.


# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, 
   id_vars = 'Student Names', 
   value_vars = ['Courses', 'Fee'], 
   var_name = 'Course details', 
   value_name = 'Attributes',
   ignore_index = False
)
print(un_pivot)

Yields below output.


# Output:
 Student Names Course details Attributes
0         Jenny        Courses       Java
1         Singh        Courses      Spark
2       Charles        Courses    PySpark
3       Richard        Courses     Hadoop
4         Veena        Courses          C
0         Jenny            Fee      15000
1         Singh            Fee      17000
2       Charles            Fee      27000
3       Richard            Fee      29000
4         Veena            Fee      12000

6. Conclusion

In this article, I have explained how to unpivot Pandas DataFrame using the melt() function by using different parameters of the melt() function with examples.

References

Naveen Nelamali

Naveen Nelamali (NNK) is a Data Engineer with 20+ years of experience in transforming data into actionable insights. Over the years, He has honed his expertise in designing, implementing, and maintaining data pipelines with frameworks like Apache Spark, PySpark, Pandas, R, Hive and Machine Learning. Naveen journey in the field of data engineering has been a continuous learning, innovation, and a strong commitment to data integrity. In this blog, he shares his experiences with the data as he come across. Follow Naveen @ LinkedIn and Medium