In pandas we can unpivot
a given DataFrame columns by using melt() method, where the format of the data changes from a wide format to a long format. Using melt() function parameters we can reduce the columns and increase the rows of the given DataFrame, this will give a long format of the DataFrame. In this article, I will explain how to unpivot the DataFrame in pandas using the melt() method and its syntax and parameters with examples.
1. Quick Examples of Pandas Unpivot the DataFrame
If you are in hurry below are some quick examples of unpivot the DataFrame in pandas.
# Below are the quick examples
# Example 1: Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, id_vars = 'Student Names', value_vars = ['Courses', 'Fee')
# Example 2: Unpivot the DataFrame using melt()
un_pivot = pd.melt(df,
id_vars = 'Student Names',
value_vars = ['Courses', 'Fee'],
var_name = 'Course details',
value_name = 'Attributes')
# Example 3: Unpivot the DataFrame using melt()
un_pivot = pd.melt(df,
id_vars = 'Student Names',
value_vars = ['Courses', 'Fee'],
var_name = 'Course details',
value_name = 'Attributes',
ignore_index = False)
2. Syntax of the melt()
Following is the syntax of the melt()
function.
# Syntax of melt()
pandas.melt(frame, id_vars=None, value_vars=None, var_name=None, value_name='value', col_level=None, ignore_index=True)
2.1 Parameters of the melt() Function
Following are the parameters of the melt() function.
id_vars
: (tuple, list, or ndarray, optional) Using this param we can set single or multiple columns that, will use as a identifiers of the new format.value_vars
: (tuple, list, or ndarray, optional) Using this param we can set the columns that are used to unpivot. If we not provide this parameter Pandas will use all remaining columns as avalue_vars
except for those specified inid_vars
.var_name
: It defines column name for the variable column.value_name
: It defines column name for the value column.col_level
: If we have multi index columns then, use this level to melt.ignore_index
: Accepts a boolean, if True, original index is ignored. If False, the original index is retained. Index labels will be repeated as necessary.
2.2 Return Value of melt()
It returns unpivoted pandas DataFrame.
3. Unpivot Pandas DataFrame Using melt()
Pandas melt() function is used to change the shape of the given DataFrame. In which format is changed from wide to long, it means that one or more columns are used as identifiers and the remaining columns are used as values. Finally, this process makes the DataFrame unpivot.
Python pandas is widely used for data science/data analysis and machine learning applications. It is built on top of another popular package named Numpy, which provides scientific computing in Python. pandas DataFrame is a 2-dimensional labeled data structure with rows and columns (columns of potentially different types like integers, strings, float, None, Python objects e.t.c). You can think of it as an excel spreadsheet or SQL table.
First, let’s create Pandas DataFrame from Python dictionary.
import numpy as np
import pandas as pd
df = pd.DataFrame({'Student Names' : ['Jenny', 'Singh', 'Charles', 'Richard', 'Veena'],
'Courses': ['Java', 'Spark', 'PySpark','Hadoop','C'],
'Fee': [15000, 17000, 27000, 29000, 12000],
'Discount': [1100, 800, 1000, 1600, 600]})
print(df)
Yields below output.
# Output:
Student Names Courses Fee Discount
0 Jenny Java 15000 1100
1 Singh Spark 17000 800
2 Charles PySpark 27000 1000
3 Richard Hadoop 29000 1600
4 Veena C 12000 600
Let’s unpivot the data using the Pandas melt() function, for that we need to pass id_vars
and value_vars
, it will return the long format unpivot DataFrame.
# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df, id_vars = 'Student Names', value_vars = ['Courses', 'Fee')
print(un_pivot)
Yields below output.
# Output:
Student Names variable value
0 Jenny Courses Java
1 Singh Courses Spark
2 Charles Courses PySpark
3 Richard Courses Hadoop
4 Veena Courses C
5 Jenny Fee 15000
6 Singh Fee 17000
7 Charles Fee 27000
8 Richard Fee 29000
9 Veena Fee 12000
Here, 'variable'
and 'value
‘ are default values of var_name
and value_name
parameters respectively.
4. Unpivot Pandas DataFrame using var_name & value_name
We can also customize var_name & value_name
these parameters with specified names in a melt(), it will return the customized var_name and value_name of unpivot DataFrame.
# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df,
id_vars = 'Student Names',
value_vars = ['Courses', 'Fee'],
var_name = 'Course details',
value_name = 'Attributes'
)
print(un_pivot)
Yields below output.
# Output:
Student Names Course details Attributes
0 Jenny Courses Java
1 Singh Courses Spark
2 Charles Courses PySpark
3 Richard Courses Hadoop
4 Veena Courses C
5 Jenny Fee 15000
6 Singh Fee 17000
7 Charles Fee 27000
8 Richard Fee 29000
9 Veena Fee 12000
5. Unpivot Pandas DataFrame using ignore_index
We can set the False
as an ignore_index
and pass it into the melt() function, it will return the unpivot DataFrame with the original index. Default ignore_index is True.
# Unpivot the DataFrame using melt()
un_pivot = pd.melt(df,
id_vars = 'Student Names',
value_vars = ['Courses', 'Fee'],
var_name = 'Course details',
value_name = 'Attributes',
ignore_index = False
)
print(un_pivot)
Yields below output.
# Output:
Student Names Course details Attributes
0 Jenny Courses Java
1 Singh Courses Spark
2 Charles Courses PySpark
3 Richard Courses Hadoop
4 Veena Courses C
0 Jenny Fee 15000
1 Singh Fee 17000
2 Charles Fee 27000
3 Richard Fee 29000
4 Veena Fee 12000
6. Conclusion
In this article, I have explained how to unpivot Pandas DataFrame using the melt() function by using different parameters of the melt() function with examples.
Related Articles
- How to Sort pandas Series
- Drop Rows From Pandas DataFrame Examples
- Drop Single & Multiple Columns From Pandas DataFrame
- Change the Order of Pandas DataFrame Columns
- Pandas groupby() and sum() With Examples
- Difference Between loc and iloc in Pandas DataFrame
- How to Transpose() DataFrame in Pandas?
- How to Create Pandas Pivot Table Count
- How to Create Pandas Pivot Multiple Columns