The PySpark element_at() function is a collection function used to retrieve an element from an array at a specified index or a value from a map for a given key. For arrays, indexing starts at 1; using an index of 0 will result in an error. If the index is negative, elements are accessed from the end of the array towards the beginning.
When it comes to maps, element_at() returns the value associated with the specified key, and returns NULL if the key does not exist in the map. This function is part of the pyspark.sql.functions module and is especially useful when working with complex data types such as arrays and maps in DataFrames.
In this article, we’ll cover the syntax, parameters, return values, and practical examples of element_at() using a sample dataset.
Key Points
- element_at() retrieves elements from arrays or values from maps in PySpark DataFrames.
- Works with both ArrayType and MapType columns.
- Array indexing starts at 1; using 0 will throw an error.
- Supports negative indexes to access elements from the end of an array.
- For map columns, returns the value associated with the specified key.
- Returns NULL if the index or key doesn’t exist, without raising an error.
- Keys for map lookup can be specified as literal strings or column references.
- Useful for working with nested or complex JSON-like data.
- Available under pyspark.sql.functions and requires explicit import.
- Efficient for element retrieval without transforming the full array or map.
PySpark element_at()
The element_at() function in PySpark is used to extract a specific element from an array or a specific value from a map based on a given index or key. It supports both positive and negative indexing for arrays, and returns NULL if the specified index or key does not exist.
Syntax of PySpark element_at()
Following is the syntax of the PySpark element_at() function.
# Syntax of PySpark element_at()
pyspark.sql.functions.element_at(col, extraction)
Parameters
- col: (Column or str) Name of the column containing an array or a map.
- extraction: A 1-based index to look up in the array, or a key to look up in the map.
Return Value
- If col is an array → Returns the element at the specified position.
- If col is a map → Returns the value for the given key.
- Returns NULL if the position or key does not exist.
Create DataFrame
Let’s create a sample DataFrame and implement this function in multiple ways.
# Create DataFrame
from pyspark.sql import SparkSession
# Create SparkSession
spark = SparkSession.builder.appName('pyspark-by-examples').getOrCreate()
# Sample Data
arrayData = [
('James',['Java','Scala'],{'hair':'black','eye':'brown'}),
('Michael',['Spark','Java',None],{'hair':'brown','eye':None}),
('Robert',['CSharp',''],{'hair':'red','eye':''}),
('Washington',None,None),
('Jefferson',['1','2'],{})
]
# Create DataFrame
df = spark.createDataFrame(data=arrayData, schema=['name','knownLanguages','properties'])
df.printSchema()
df.show(truncate=False)
Yields the output below.
# Output:
root
 |-- name: string (nullable = true)
 |-- knownLanguages: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- properties: map (nullable = true)
 |    |-- key: string
 |    |-- value: string (valueContainsNull = true)

+----------+-------------------+-----------------------------+
|name      |knownLanguages     |properties                   |
+----------+-------------------+-----------------------------+
|James     |[Java, Scala]      |{hair -> black, eye -> brown}|
|Michael   |[Spark, Java, NULL]|{hair -> brown, eye -> NULL} |
|Robert    |[CSharp, ]         |{hair -> red, eye -> }       |
|Washington|NULL               |NULL                         |
|Jefferson |[1, 2]             |{}                           |
+----------+-------------------+-----------------------------+
Get the First Element of an Array
You can use the element_at() function to get the first element of an array by specifying its index. Simply pass the array column along with the desired index to the function, and it will return the first element of the array for each row.
# Get the First Element of an Array
from pyspark.sql.functions import element_at
df.select(
"name",
element_at("knownLanguages", 1)
).show(truncate=False)
Yields the output below.
# Output:
+----------+-----------------------------+
|name      |element_at(knownLanguages, 1)|
+----------+-----------------------------+
|James     |Java                         |
|Michael   |Spark                        |
|Robert    |CSharp                       |
|Washington|NULL                         |
|Jefferson |1                            |
+----------+-----------------------------+
Get the Last Element of an Array using Negative Index
To get the last element of an array, use a negative index. For example, passing -1 retrieves the last element in the array for each row. Negative indexes count from the end towards the beginning of the array.
# Get Last Element of an Array using Negative Index
df.select(
"name",
element_at("knownLanguages", -1)
).show(truncate=False)
Yields the output below.
# Output:
+----------+------------------------------+
|name |element_at(knownLanguages, -1)|
+----------+------------------------------+
|James |Scala |
|Michael |NULL |
|Robert | |
|Washington|NULL |
|Jefferson |2 |
+----------+------------------------------+
Get a Value from a Map using a Key
You can also use element_at() to retrieve the value associated with a specific key in a map column. Pass the map column and the specified key as arguments, and the function will return the matching value for each row.
# Get a Value from a Map using a Key
df.select(
"name",
element_at("properties", "eye")
).show(truncate=False)
Yields the output below.
# Output:
+----------+---------------------------+
|name |element_at(properties, eye)|
+----------+---------------------------+
|James |brown |
|Michael |NULL |
|Robert | |
|Washington|NULL |
|Jefferson |NULL |
+----------+---------------------------+
Get a Non-Existing Value from a Map using a Key
If you try to retrieve a value for a key that doesn’t exist in the map, the element_at() function returns NULL instead of throwing an error. This makes it a safe choice for working with incomplete or inconsistent data.
# Get a Non-Existing Value from a Map using a Key
df.select(
"name",
element_at("properties", "height")
).show(truncate=False)
Yields the output below.
# Output:
+----------+------------------------------+
|name |element_at(properties, height)|
+----------+------------------------------+
|James |NULL |
|Michael |NULL |
|Robert |NULL |
|Washington|NULL |
|Jefferson |NULL |
+----------+------------------------------+
PySpark Element in Array and Map Together
You can also access array and map columns in the same query using the element_at() function. This allows you to fetch related information from multiple complex columns in one go.
# Access both array and map columns using element_at()
df.select(
"name",
element_at("knownLanguages", 1),
element_at("properties", "hair")
).show(truncate=False)
Yields the output below.
# Output
+----------+-----------------------------+----------------------------+
|name |element_at(knownLanguages, 1)|element_at(properties, hair)|
+----------+-----------------------------+----------------------------+
|James |Java |black |
|Michael |Spark |brown |
|Robert |CSharp |red |
|Washington|NULL |NULL |
|Jefferson |1 |NULL |
+----------+-----------------------------+----------------------------+
PySpark element_at() vs getItem()
Both element_at() and getItem() can be used to retrieve elements, but there are differences:
- element_at() uses 1-based indexing for arrays and supports negative indexes.
- getItem() uses 0-based indexing and does not support negative indexes.
- When working with maps, both can be used, but element_at() is often preferred for its flexibility.
# PySpark element_at() vs getItem()
from pyspark.sql.functions import col
df.select(
"name",
element_at("knownLanguages", 2),
col("knownLanguages").getItem(1)
).show(truncate=False)
Yields the output below.
# Output:
+----------+-----------------------------+-----------------+
|name |element_at(knownLanguages, 2)|knownLanguages[1]|
+----------+-----------------------------+-----------------+
|James |Scala |Scala |
|Michael |Java |Java |
|Robert | | |
|Washington|NULL |NULL |
|Jefferson |2 |2 |
+----------+-----------------------------+-----------------+
PySpark Get Element At
element_at() also works with dynamic indexes or keys passed as column expressions or literal values. For example, you can use lit(-1) to dynamically retrieve the last element of an array.
# Retrieve the last element of an array using lit()
from pyspark.sql.functions import lit
df.select(
"name",
element_at("knownLanguages", lit(-1))
).show(truncate=False)
Yields below the output.
# Output
+----------+------------------------------+
|name |element_at(knownLanguages, -1)|
+----------+------------------------------+
|James |Scala |
|Michael |NULL |
|Robert | |
|Washington|NULL |
|Jefferson |2 |
+----------+------------------------------+
Frequently Asked Questions of PySpark element_at()
What does element_at() do?
element_at() retrieves an element from an array at a specified index or a value from a map for a given key.

Does element_at() use 0-based or 1-based indexing?
For arrays, element_at() uses 1-based indexing, meaning the first element is accessed with index 1; using index 0 results in an error in PySpark.

How do I get the last element of an array?
Use a negative index to retrieve elements from the end of the array (e.g., -1 gets the last element).

What happens if the index is out of range?
If the array index is greater than the array size or less than -len(array), element_at() returns NULL instead of throwing an error.

How does element_at() work with maps?
When used on a map, element_at(col, key) returns the value associated with the given key. If the key does not exist, it returns NULL.

What is the difference between element_at() and getItem()?
element_at() uses 1-based indexing for arrays and supports negative indexes, while getItem() uses 0-based indexing and does not support negative indexes. Both can be used with maps.
Conclusion
In this article, I explained multiple ways to work with array and map columns in a DataFrame using the PySpark element_at() function. This versatile, NULL-safe function allows you to retrieve elements from both arrays and maps. It supports 1-based indexing for arrays, negative indexes for accessing elements from the end, and gracefully returns NULL for missing indexes or keys without raising an error.
Happy Learning!!
Related Articles
- PySpark – explode nested array into rows
- PySpark Convert Dictionary/Map to Multiple Columns
- PySpark ArrayType Column With Examples
- PySpark map() Transformation
- PySpark array_contains() function with examples.
- PySpark MapType (Dict) Usage with Examples