  • Post category:PySpark
  • Post last modified:October 16, 2025

posexplode_outer() in PySpark explodes (flattens) an array or map column into multiple rows while retaining the position (index) of each element. Unlike posexplode(), which skips rows with null or empty arrays/maps, posexplode_outer() still produces a row when the array or map is null or empty, returning null for the position and for the element columns.

In this article, I will explain the PySpark posexplode_outer() function, including its syntax, parameters, and practical usage. You’ll learn how to use it to explode array or map columns in a DataFrame into multiple rows while retaining the position (index) of each element. Additionally, I’ll show how it returns null values for rows where the array or map columns are null or empty.

Key Points

  • It returns a new row for each element in an array or each key-value pair in a map column.
  • It includes an additional column for the position of each element within the array or map.
  • If the array or map column is null or empty, it produces a row with (null, null) values instead of dropping the row.
  • By default, the resulting columns are named pos for the position and col for the element in an array; for a map, it produces pos, key, and value columns.
  • This function combines the behavior of explode_outer() (which retains null/empty array rows) and posexplode() (which adds element positions).
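
The key points above can be modeled in plain Python, without Spark, for a single array cell. This is only an illustrative sketch of the row-level semantics; posexplode_model and posexplode_outer_model are hypothetical helper names and are not part of PySpark.

```python
from typing import Any, List, Optional, Tuple

Row = Tuple[Optional[int], Any]

def posexplode_model(values: Optional[List[Any]]) -> List[Row]:
    """Model of posexplode(): null or empty input yields no rows."""
    if not values:
        return []
    return list(enumerate(values))

def posexplode_outer_model(values: Optional[List[Any]]) -> List[Row]:
    """Model of posexplode_outer(): null or empty input yields one (None, None) row."""
    if not values:
        return [(None, None)]
    return list(enumerate(values))

# Hypothetical cells: a populated array, a null array, and an empty array
cells = {"James": ["Java", "Scala"], "Washington": None, "Adams": []}

inner = {name: posexplode_model(v) for name, v in cells.items()}
outer = {name: posexplode_outer_model(v) for name, v in cells.items()}

print(inner)
print(outer)
```

In this model, the two functions agree on populated arrays and differ only on null or empty input, which is the distinction the rest of this article relies on.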

PySpark posexplode_outer() Function

The PySpark posexplode_outer() function operates similarly to posexplode(), generating a new row for each element in an array or map along with its position. The key difference is that posexplode_outer() retains rows where the array or map is null or empty by generating rows with null positions and values, whereas posexplode() would skip such rows.

Syntax

Following is the syntax of the posexplode_outer() function.


# Syntax of posexplode_outer()
from pyspark.sql.functions import posexplode_outer
posexplode_outer(col)

Parameters

  • col: The name or expression of the column containing an array or map.

Return Value

  • Returns a new row for each element of an array or each key-value pair of a map, along with a column holding its position.
  • For null or empty arrays/maps, produces a row with null values.

Let’s start with a sample DataFrame containing arrays and maps.


# Create a SparkSession and prepare sample data
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName('pyspark-by-examples').getOrCreate()

arrayData = [
    ('James', ['Java', 'Scala'], {'hair': 'black', 'eye': 'brown'}),
    ('Michael', ['Spark', 'Java', None], {'hair': 'brown', 'eye': None}),
    ('Robert', ['CSharp', ''], {'hair': 'red', 'eye': ''}),
    ('Washington', None, None),
    ('Jefferson', ['1', '2'], {})
]

df = spark.createDataFrame(data=arrayData, schema=['name', 'knownLanguages', 'properties'])
df.printSchema()
df.show(truncate=False)

This yields the output below.

[Figure: PySpark posexplode_outer() sample DataFrame]

Using PySpark posexplode_outer() on Array Column

You can apply posexplode_outer() to an array column to create a new row for each element. Unlike posexplode(), posexplode_outer() retains rows where arrays are null or empty by producing rows with null positions and values.


# Using posexplode_outer() on Array Column
from pyspark.sql.functions import posexplode_outer

df_outer = df.select(df.name, posexplode_outer(df.knownLanguages))
df_outer.show(truncate=False)

This yields the output below.

[Figure: PySpark posexplode_outer() on an array column]

As shown, the row with a null array (“Washington”) is retained with (null, null) values; a row with an empty array would be retained in the same way.

Using PySpark posexplode_outer() on Map Column

posexplode_outer() on a map column generates rows with position, key, and value. Null or empty maps generate a single row with nulls.


# Using posexplode_outer() on a map column
df_outer_map = df.select(df.name, posexplode_outer(df.properties).alias("pos", "key", "value"))
df_outer_map.show(truncate=False)

This yields the output below.


# Output:
+----------+----+----+-----+
|name      |pos |key |value|
+----------+----+----+-----+
|James     |0   |eye |brown|
|James     |1   |hair|black|
|Michael   |0   |eye |NULL |
|Michael   |1   |hair|brown|
|Robert    |0   |eye |     |
|Robert    |1   |hair|red  |
|Washington|NULL|NULL|NULL |
|Jefferson |NULL|NULL|NULL |
+----------+----+----+-----+

Here, “Washington” (a null map) and “Jefferson” (an empty map) each produce a single row in which pos, key, and value are all null.

Compare posexplode_outer() vs posexplode()

The table below highlights the key differences between posexplode() and posexplode_outer() in PySpark:

| Feature | posexplode() | posexplode_outer() |
|---|---|---|
| Handles null/empty arrays | Skips rows | Retains rows with null outputs |
| Output columns for arrays | pos, col | pos, col |
| Output columns for maps | pos, key, value | pos, key, value |
| Useful when order matters | Yes | Yes |
| Preserves null/empty rows | No | Yes |

Frequently Asked Questions of PySpark posexplode_outer()

What is the difference between posexplode() and posexplode_outer()?

posexplode() skips null or empty arrays/maps, while posexplode_outer() includes them with (null, null) placeholders.

When should I use posexplode_outer()?

Use it when you want to preserve all rows, even if some arrays or maps are missing or empty.

Does posexplode_outer() work with both arrays and maps?

Yes, it supports both. Arrays return pos and col; maps return pos, key, and value.

What is the main advantage of posexplode_outer()?

It prevents data loss by retaining rows that would otherwise be dropped during explosion.

Conclusion

In this article, I explained the posexplode_outer() function in PySpark with examples using arrays and maps.
You learned how it differs from posexplode(), when to use it, and how it helps retain null or empty array/map rows.

Use posexplode_outer() when working with nested data structures where preserving all rows, including nulls, is important.

Happy Learning!!

Reference

https://spark.apache.org/docs/latest/api/python/reference/pyspark.sql/api/pyspark.sql.functions.posexplode_outer.html