Spark explode array and map columns to rows

In this article, I will explain how to explode array (list) and map DataFrame columns to rows using the different Spark explode functions (explode, explode_outer, posexplode, posexplode_outer), with Scala examples.

While working with structured files like JSON, Parquet, Avro, and XML, we often get data in collections such as arrays, lists, and maps. In such cases, these explode functions are useful to convert collection columns to rows so they can be processed effectively in Spark.

Though I’ve explained this with Scala, a similar approach works for exploding array and map columns to rows with PySpark, and if time permits I will cover it in the future. If you are looking for PySpark, I would still recommend reading through this article as it gives you an idea of the Spark explode functions and their usage.

Before we start, let’s create a DataFrame with array and map fields. The snippet below creates a DataFrame with the columns “name” as StringType, “knownLanguages” as ArrayType, and “properties” as MapType.

In the code below, “spark” is an instance of SparkSession; please refer to the complete code at the end to see how the SparkSession object is created.


    import spark.implicits._
    import org.apache.spark.sql.Row
    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.types.{ArrayType, MapType, StringType, StructType}

    val arrayData = Seq(
      Row("James", List("Java", "Scala"), Map("hair" -> "black", "eye" -> "brown")),
      Row("Michael", List("Spark", "Java", null), Map("hair" -> "brown", "eye" -> null)),
      Row("Robert", List("CSharp", ""), Map("hair" -> "red", "eye" -> "")),
      Row("Washington", null, null),
      Row("Jefferson", List(), Map())
    )

    val arraySchema = new StructType()
      .add("name", StringType)
      .add("knownLanguages", ArrayType(StringType))
      .add("properties", MapType(StringType, StringType))

    val df = spark.createDataFrame(spark.sparkContext.parallelize(arrayData), arraySchema)
    df.printSchema()
    df.show(false)
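
For reference, df.printSchema() should print a schema like the following, confirming the array and map column types:


// Output
root
 |-- name: string (nullable = true)
 |-- knownLanguages: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- properties: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)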

1. explode – Spark explode array or map column to rows

The Spark function explode(e: Column) is used to explode array or map columns to rows. When an array is passed to this function, it creates a new default column “col” that contains all the array elements. When a map is passed, it creates two new columns, one for the key and one for the value, and each element of the map becomes a row.

explode ignores rows whose array or map column is null or empty (null elements inside a non-empty array, such as Michael’s, are still returned). From the above example, Washington and Jefferson have null or empty values in the array and map columns, hence the following snippet’s output does not contain these rows.

1.1 explode – array column example


    // Explode - array column example
    df.select($"name", explode($"knownLanguages"))
      .show(false)

Outputs:


// Output
+-------+------+
|name   |col   |
+-------+------+
|James  |Java  |
|James  |Scala |
|Michael|Spark |
|Michael|Java  |
|Michael|null  |
|Robert |CSharp|
|Robert |      |
+-------+------+
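
If you prefer SQL syntax, the same result can be produced with a Spark SQL query; here is a minimal sketch, assuming we register the DataFrame under the illustrative view name "people":


    // Spark SQL equivalent of the explode example above
    // (the view name "people" is hypothetical, used only for illustration)
    df.createOrReplaceTempView("people")
    spark.sql("SELECT name, explode(knownLanguages) FROM people")
      .show(false)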

1.2 explode – map column example


    // Explode - map column example
    df.select($"name", explode($"properties"))
      .show(false)

Outputs:


// Output
+-------+----+-----+
|name   |key |value|
+-------+----+-----+
|James  |hair|black|
|James  |eye |brown|
|Michael|hair|brown|
|Michael|eye |null |
|Robert |hair|red  |
|Robert |eye |     |
+-------+----+-----+
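
The generated “key” and “value” column names can be replaced by aliasing the generator with a list of names; a minimal sketch, where "attribute" and "attribute_value" are illustrative names:


    // Rename the generated key/value columns with a multi-column alias
    df.select($"name", explode($"properties").as(Seq("attribute", "attribute_value")))
      .show(false)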

2. explode_outer – create rows for each element in an array or map

The Spark SQL explode_outer(e: Column) function creates a row for each element in the array or map column. Unlike explode, if the array or map is null or empty, explode_outer still returns a row, with null in the generated columns.

2.1 explode_outer – array example


    // Explode_outer - array example
    df.select($"name", explode_outer($"knownLanguages"))
      .show(false)

Outputs:



// Output:
+----------+------+
|name      |col   |
+----------+------+
|James     |Java  |
|James     |Scala |
|Michael   |Spark |
|Michael   |Java  |
|Michael   |null  |
|Robert    |CSharp|
|Robert    |      |
|Washington|null  |
|Jefferson |null  |
+----------+------+

2.2 explode_outer – map example


    // Explode_outer - map example
    df.select($"name", explode_outer($"properties"))
      .show(false)

Outputs:


// Output 
+----------+----+-----+
|name      |key |value|
+----------+----+-----+
|James     |hair|black|
|James     |eye |brown|
|Michael   |hair|brown|
|Michael   |eye |null |
|Robert    |hair|red  |
|Robert    |eye |     |
|Washington|null|null |
|Jefferson |null|null |
+----------+----+-----+
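
A practical use of explode_outer is finding records that have no data in the collection column at all; since only those rows come back with a null key, a simple filter isolates them. A minimal sketch:


    // Names whose properties map is null or empty
    df.select($"name", explode_outer($"properties"))
      .filter($"key".isNull)
      .show(false)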

3. posexplode – explode array or map elements to rows

posexplode(e: Column) creates a row for each element in the array and produces two columns: “pos” to hold the position of the array element and “col” to hold the actual array value. When the input column is a map, posexplode creates three columns: “pos” to hold the position of the map element, plus the “key” and “value” columns.

Like explode, posexplode ignores rows whose array or map column is null or empty. Since Washington and Jefferson have null or empty values in the array and map columns, the following snippet’s output does not contain these rows.

3.1 posexplode – array example


    // Posexplode - array example
    df.select($"name", posexplode($"knownLanguages"))
      .show(false)

Outputs:


// Output 
+-------+---+------+
|name   |pos|col   |
+-------+---+------+
|James  |0  |Java  |
|James  |1  |Scala |
|Michael|0  |Spark |
|Michael|1  |Java  |
|Michael|2  |null  |
|Robert |0  |CSharp|
|Robert |1  |      |
+-------+---+------+
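
The “pos” column makes position-based selection straightforward; for example, the following sketch keeps only the first element of each array:


    // Keep only the first known language per person
    df.select($"name", posexplode($"knownLanguages"))
      .filter($"pos" === 0)
      .show(false)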

3.2 posexplode – map example


    // Posexplode - map example
    df.select($"name", posexplode($"properties"))
      .show(false)

Outputs:


// Output
+-------+---+----+-----+
|name   |pos|key |value|
+-------+---+----+-----+
|James  |0  |hair|black|
|James  |1  |eye |brown|
|Michael|0  |hair|brown|
|Michael|1  |eye |null |
|Robert |0  |hair|red  |
|Robert |1  |eye |     |
+-------+---+----+-----+

4. posexplode_outer – explode array or map columns to rows

Spark posexplode_outer(e: Column) creates a row for each element in the array and produces two columns: “pos” to hold the position of the array element and “col” to hold the actual array value. Unlike posexplode, if the array is null or empty, posexplode_outer returns a row with null for the “pos” and “col” columns. Similarly, for a null or empty map, it returns a row with nulls.

4.1 posexplode_outer – array example


    // Posexplode_outer - array example
    df.select($"name", posexplode_outer($"knownLanguages"))
      .show(false)

Outputs:


// Output  
+----------+----+------+
|name      |pos |col   |
+----------+----+------+
|James     |0   |Java  |
|James     |1   |Scala |
|Michael   |0   |Spark |
|Michael   |1   |Java  |
|Michael   |2   |null  |
|Robert    |0   |CSharp|
|Robert    |1   |      |
|Washington|null|null  |
|Jefferson |null|null  |
+----------+----+------+

4.2 posexplode_outer – map example


    // Posexplode_outer - map example
    df.select($"name", posexplode_outer($"properties"))
      .show(false)

Outputs:


// Output  
+----------+----+----+-----+
|name      |pos |key |value|
+----------+----+----+-----+
|James     |0   |hair|black|
|James     |1   |eye |brown|
|Michael   |0   |hair|brown|
|Michael   |1   |eye |null |
|Robert    |0   |hair|red  |
|Robert    |1   |eye |     |
|Washington|null|null|null |
|Jefferson |null|null|null |
+----------+----+----+-----+
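
As with explode, the three generated columns can be renamed by aliasing the generator with three names; "position", "attribute", and "attribute_value" below are illustrative:


    // Rename all three generated columns with a multi-column alias
    df.select($"name", posexplode_outer($"properties").as(Seq("position", "attribute", "attribute_value")))
      .show(false)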

The complete example of exploding arrays or maps to rows


package com.sparkbyexamples.spark.dataframe.functions

import org.apache.spark.sql.{Row, SparkSession}
import org.apache.spark.sql.functions._
import org.apache.spark.sql.types.{ArrayType, MapType, StringType, StructType}

object ExplodeArrayAndMap{

  def main(args:Array[String]) : Unit = {

    val spark = SparkSession.builder().appName("SparkByExamples.com")
      .master("local[1]")
      .getOrCreate()

     // Create DataFrame

    val arrayData = Seq(
      Row("James", List("Java", "Scala"), Map("hair" -> "black", "eye" -> "brown")),
      Row("Michael", List("Spark", "Java", null), Map("hair" -> "brown", "eye" -> null)),
      Row("Robert", List("CSharp", ""), Map("hair" -> "red", "eye" -> "")),
      Row("Washington", null, null),
      Row("Jefferson", List(), Map())
    )

    val arraySchema = new StructType()
      .add("name",StringType)
      .add("knownLanguages", ArrayType(StringType))
      .add("properties", MapType(StringType,StringType))

    val df = spark.createDataFrame(spark.sparkContext.parallelize(arrayData),arraySchema)
    df.printSchema()
    df.show()

    import spark.implicits._
    // Below are Array examples
    // Explode
    df.select($"name",explode($"knownLanguages"))
      .show()

    // Explode_outer
    df.select($"name",explode_outer($"knownLanguages"))
      .show()

    // Posexplode
    df.select($"name",posexplode($"knownLanguages"))
      .show()

    // Posexplode_outer
    df.select($"name",posexplode_outer($"knownLanguages"))
      .show()

    // Below are Map examples

    // Explode
    df.select($"name",explode($"properties"))
      .show()
    // Explode_outer
    df.select($"name",explode_outer($"properties"))
      .show()
    // Posexplode
    df.select($"name",posexplode($"properties"))
      .show()

    // Posexplode_outer
    df.select($"name",posexplode_outer($"properties"))
      .show()
  }
}
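
To build and run this example with sbt, only the spark-sql module is required on the classpath; here is a minimal build.sbt sketch (the version numbers are illustrative, match them to your environment):


    // build.sbt - versions shown are illustrative
    scalaVersion := "2.12.18"

    libraryDependencies += "org.apache.spark" %% "spark-sql" % "3.5.1"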

Some common FAQs about explode functions

What is the explode function?

The Spark SQL explode function is used to split an array or map DataFrame column into rows. Spark defines several flavors of this function: explode_outer, to handle nulls and empty collections; posexplode, which explodes with the position of each element; and posexplode_outer, which combines both behaviors.

Difference between explode and explode_outer

explode – creates a row for each element in the array or map column, ignoring rows where the array or map itself is null or empty, whereas explode_outer returns all rows, producing nulls for a null or empty array or map.

Difference between explode and posexplode

explode – creates a row for each element in the array or map column, whereas posexplode additionally returns each element’s position: for an array it creates two columns, “pos” for the position and “col” for the value, and for a map it creates three columns, “pos”, “key”, and “value”.

Conclusion

In this article, you have learned how to explode or convert array and map DataFrame columns to rows using the explode and posexplode SQL functions and their respective outer variants, and also learned the differences between these functions.

