
How to use Pandas unstack() Function

In Pandas, the unstack() function pivots a level of a multi-level row index into the column axis. By default, it pivots the innermost row index level into column labels, but you can specify other levels if needed. Reshaping a DataFrame is a common technique in data analysis, and Pandas provides a rich set of built-in functions for this purpose.

In this article, I will explain the pandas unstack() function, its syntax and parameters, and how to use it to move single or multiple row index levels into column levels, with examples.

Quick Examples of unstack() Function

If you are in a hurry, below are some quick examples of the unstack() function.


# Quick examples of unstack() function

# Example 1: Unstack default level(-1) of pandas Series
print("Unstacked DataFrame :\n", ser.unstack())

# Example 2: Unstack specified level
print("Unstacked DataFrame :\n", ser.unstack(level = 0))

# Example 3: Unstack default level(-1) of pandas DataFrame
print("Unstacked DataFrame :\n", df.unstack())

# Example 4: Unstack specified level
print("Unstacked DataFrame :\n", df.unstack(level = 0))

# Example 5: Fill NaN value set fill_value
print("Unstacked DataFrame :\n", df.unstack(level = 0, fill_value = '-'))

Pandas unstack() Introduction

Following is the syntax of unstack() function.


# Syntax of Pandas unstack()
DataFrame.unstack(level=-1, fill_value=None)

Parameters of the unstack()

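Following are the parameters of the pandas unstack() function.

level: The index level or levels to unstack, given as an int (position), a str (level name), or a list of these. The default, -1, is the last (innermost) level.
fill_value: The value used in place of NaN when unstacking produces missing entries. The default is None.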

Return Value

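It returns the unstacked Series or DataFrame. Unstacking a level of a multi-indexed Series or DataFrame produces a DataFrame; calling unstack() on a DataFrame whose index is not a MultiIndex produces a Series.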
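
The return type depends on the shape of the input index. The following is a minimal sketch using hypothetical sample data (the names s, wide, and flat are illustrative only): unstacking a two-level Series gives a DataFrame, and unstacking that DataFrame, whose row index now has a single level, gives a Series.

```python
import pandas as pd

# Hypothetical two-level Series used only for illustration
idx = pd.MultiIndex.from_tuples(
    [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y")]
)
s = pd.Series([1, 2, 3, 4], index=idx)

# Two-level Series -> DataFrame (inner level becomes the columns)
wide = s.unstack()
print(type(wide).__name__)

# DataFrame with a single-level row index -> Series
# (the former column labels become the outer index level)
flat = wide.unstack()
print(type(flat).__name__)
```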

Usage of unstack() Function

In pandas, the unstack() function is used to reshape a Series or DataFrame by converting one or more levels of the row index into column labels. The values keyed by the chosen level are spread across new columns, so a long, stacked layout becomes a wider one. This function is the reverse operation of the stack() function, which moves data from the column level to the row level.
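
To illustrate the relationship between stack() and unstack(), here is a small sketch on hypothetical data (df_flat and its values are made up for this example). It stacks the columns into the row index and then unstacks them again.

```python
import pandas as pd

# Hypothetical sample data, not taken from the examples in this article
df_flat = pd.DataFrame(
    {"Humidity": [65.0, 71.0], "Temp": [40.5, 40.2]},
    index=["Portland", "Seattle"],
)

# stack() moves the column labels into an inner row-index level
stacked = df_flat.stack()

# unstack() moves that inner level back out to the columns
restored = stacked.unstack()

print(stacked)
print(restored)
```

Here restored holds the same values, row labels, and column labels as df_flat.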

Let’s create a Pandas Series with a multi-level index and apply the unstack() function to it, which returns an unstacked Pandas DataFrame. The multi-level index is built using pandas.MultiIndex.from_tuples().


# Create multi index Pandas Series
import pandas as pd
index = pd.MultiIndex.from_tuples([
  ('Seattle', 'Date'), 
  ('Seattle', 'Temp'),
  ('Sanfrancesco', 'Date'), 
  ('Sanfrancesco', 'Temp')
])
ser = pd.Series(["30-12-2010", 40.7, "31-12-2010",  40.5], index=index)
print(ser)

This prints a Series with a two-level row index: the outer level holds the city names and the inner level holds the Date and Temp labels.

Unstack Pandas Series

Apply the unstack() function to a multi-indexed Pandas Series. By default, it unstacks the innermost row level to the column level and returns a DataFrame whose column labels are the innermost row index values of the original Series.


# Unstack default level(-1) of pandas Series
print("Unstacked DataFrame :\n", ser.unstack())

Unstack Specified Level of Pandas Series

When you don’t specify a level, unstack() by default unstacks the innermost row level onto a column level, which is equivalent to setting level=-1. However, if you want to unstack a specific row level (or levels) to the column level, you can specify the level(s) using the level parameter.


# Unstack specified level
print("Unstacked DataFrame :\n", ser.unstack(level = 0))

# Output: 
# Unstacked DataFrame :
#       Sanfrancesco     Seattle
# Date   31-12-2010  30-12-2010
# Temp         40.5        40.7

In the above example, by setting level=0, we’re unstacking the first level of the multi-index (‘Seattle’ and ‘Sanfrancesco’) onto the column level, resulting in a DataFrame where these cities become column labels.
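
The level parameter also accepts a level name or a list of levels. The sketch below assumes a hypothetical three-level Series with named levels (Year, City, and Metric are illustrative names, not part of the earlier examples).

```python
import pandas as pd

# Hypothetical three-level index with named levels
idx = pd.MultiIndex.from_tuples(
    [
        ("2010", "Seattle", "Temp"),
        ("2010", "Seattle", "Wind"),
        ("2011", "Seattle", "Temp"),
        ("2011", "Seattle", "Wind"),
    ],
    names=["Year", "City", "Metric"],
)
ser3 = pd.Series([40.7, 5.0, 41.2, 6.5], index=idx)

# Unstack a single level, referring to it by name
by_metric = ser3.unstack(level="Metric")
print(by_metric)

# Unstack two levels at once; the result has two column levels
by_city_metric = ser3.unstack(level=["City", "Metric"])
print(by_city_metric)
```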

Unstack Pandas DataFrame

Apply the unstack() function to a multi-indexed Pandas DataFrame. It returns an unstacked DataFrame with two column levels: the original column labels on top and the innermost row index values of the original DataFrame beneath them.

Let’s create a Pandas DataFrame with a multi-level index:


# Create a DataFrame with a two-level row index
multi_index = pd.MultiIndex.from_tuples([
    ("Index1", "Seattle"),
    ("Index1", "Sanfrancesco"),
    ("Index2", "Newyork"),
    ("Index2", "Washington"),
])
df = pd.DataFrame({
    "Date": ["30-12-2010", "30-12-2010", "30-12-2010", "30-12-2010"],
    "Temp": [40.2, 40.5, 41.4, 42.1],
}, index=multi_index)

print(df)

# Unstack default level(-1) of pandas DataFrame
print("Unstacked DataFrame :\n", df.unstack())

# Output:
#                           Date  Temp
# Index1 Seattle       30-12-2010  40.2
#       Sanfrancesco  30-12-2010  40.5
# Index2 Newyork       30-12-2010  41.4
#       Washington     30-12-2010  42.1
#

# Unstacked DataFrame :
#               Date                           ...         Temp                  
#           Newyork Sanfrancesco     Seattle  ... Sanfrancesco Seattle Washington
# Index1         NaN   30-12-2010  30-12-2010  ...         40.5    40.2       NaN
# Index2  30-12-2010          NaN         NaN  ...          NaN     NaN      42.1
#
# [2 rows x 8 columns]
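
Since the unstacked result has two column levels, one group of columns can be selected by its top-level label. The following sketch rebuilds the same sample DataFrame so that it runs on its own; the names unstacked and temps are illustrative.

```python
import pandas as pd

# Same sample data as in the example above
multi_index = pd.MultiIndex.from_tuples([
    ("Index1", "Seattle"),
    ("Index1", "Sanfrancesco"),
    ("Index2", "Newyork"),
    ("Index2", "Washington"),
])
df = pd.DataFrame({
    "Date": ["30-12-2010", "30-12-2010", "30-12-2010", "30-12-2010"],
    "Temp": [40.2, 40.5, 41.4, 42.1],
}, index=multi_index)

unstacked = df.unstack()

# Select only the Temp group; the result has one column per city
temps = unstacked["Temp"]
print(temps)
```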

Unstack Specified Level of Pandas DataFrame

The example below unstacks a specified row level to the column level. Here, level=0 moves the outer level (‘Index1’ and ‘Index2’) into the columns.


# Unstack specified level
print("Unstacked DataFrame :\n", df.unstack(level = 0))

# Output:
#                      Date               Temp       
#                  Index1      Index2 Index1 Index2
# Newyork              NaN  30-12-2010    NaN   41.4
# Sanfrancesco  30-12-2010         NaN   40.5    NaN
# Seattle       30-12-2010         NaN   40.2    NaN
# Washington            NaN  30-12-2010    NaN   42.1

Fill NaN Values with fill_value Param

Pass the fill_value param to the unstack() function to replace the NaN values that unstacking produces with a specific value.


# Fill NaN value set fill_value
print("Unstacked DataFrame :\n", df.unstack(level = 0, fill_value = '-'))

# Output:
# Unstacked DataFrame:
#                      Date               Temp       
#                  Index1      Index2 Index1 Index2
# Newyork                -  30-12-2010      -   41.4
# Sanfrancesco  30-12-2010           -   40.5      -
# Seattle       30-12-2010           -   40.2      -
# Washington              -  30-12-2010      -   42.1

Frequently Asked Questions on Pandas unstack() Function

What does the Pandas unstack() function do?

The unstack() function in Pandas reshapes a Series or DataFrame by pivoting a level of the hierarchical row index (the innermost level by default) into columns. This allows for easier manipulation and analysis of multi-level indexed data.

When should I use the unstack() function?

You should use unstack() when you have hierarchical or multi-level indexed data and need to convert it into a more traditional two-dimensional DataFrame structure for easier analysis or visualization.
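
One common situation of this kind is the result of a groupby over two keys, which carries a two-level index. The sketch below uses hypothetical records (the column names and values are made up) and unstacks one key into the columns to get a compact table.

```python
import pandas as pd

# Hypothetical long-format records
records = pd.DataFrame({
    "City": ["Seattle", "Seattle", "Newyork", "Newyork"],
    "Year": [2010, 2011, 2010, 2011],
    "Temp": [40.2, 41.0, 41.4, 42.1],
})

# Grouping by two keys yields a Series with a two-level index
mean_temp = records.groupby(["City", "Year"])["Temp"].mean()

# Move the Year level into the columns: one row per city, one column per year
table = mean_temp.unstack(level="Year")
print(table)
```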

Can unstack() be applied to both Series and DataFrames?

The unstack() function can be applied to both Series and DataFrames in Pandas. It operates on the index of the object, so it’s applicable whenever you have hierarchical indexes.

How does unstack() handle missing data?

By default, unstack() will introduce NaN values for any combinations of index levels that are not present in the original DataFrame. However, you can specify how to handle missing data using the fill_value parameter to replace NaNs with a specified value.
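
As a small illustration with hypothetical data (the counts Series is made up for this example), the two calls below differ only in fill_value.

```python
import pandas as pd

# Hypothetical Series in which each outer label has a different inner label
idx = pd.MultiIndex.from_tuples(
    [("Index1", "Seattle"), ("Index2", "Newyork")]
)
counts = pd.Series([3, 5], index=idx)

# Missing combinations become NaN by default
with_nan = counts.unstack()
print(with_nan)

# Missing combinations are replaced by the given fill value
filled = counts.unstack(fill_value=0)
print(filled)
```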

Does unstack() alter the original DataFrame?

No. unstack() does not modify the original Series or DataFrame; it returns a new, reshaped object. To keep the result, assign it to a variable, for example df = df.unstack().
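
One way to check how unstack() treats its input is to compare the object before and after the call. This is a minimal sketch on hypothetical data; df_orig, snapshot, and wide are illustrative names.

```python
import pandas as pd

# Hypothetical multi-indexed DataFrame
idx = pd.MultiIndex.from_tuples([
    ("Index1", "Seattle"),
    ("Index1", "Sanfrancesco"),
    ("Index2", "Newyork"),
])
df_orig = pd.DataFrame({"Temp": [40.2, 40.5, 41.4]}, index=idx)

# Keep a copy to compare against after calling unstack()
snapshot = df_orig.copy()

# unstack() returns a new object with the reshaped data
wide = df_orig.unstack()

print(df_orig.equals(snapshot))  # whether the input is unchanged
print(df_orig.shape, wide.shape)
```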

Conclusion

In this article, you have learned the Pandas unstack() function, its syntax and parameters, and how to use it to move row index levels to column levels in a Series or DataFrame.

Happy learning!!
