Spark Set JVM Options to Driver & Executors

This article explains how to pass JVM options to the driver and executors when submitting Spark or PySpark applications with spark-submit.

You can set JVM options for the driver and executors using spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively when using spark-submit.

Related: How to set Environment Variables to Executors

By Using Spark Submit

Regardless of whether you are using Spark with Scala or PySpark, you can use the extraJavaOptions properties to set JVM options for the driver and executors.


spark-submit --master yarn \
    --deploy-mode cluster \
    --name my-app \
    --conf 'spark.executor.extraJavaOptions=-DenvVar1=var1Value -DenvVar2=var2Value' \
    --conf 'spark.driver.extraJavaOptions=-DenvVar1=var1Value -DenvVar2=var2Value' \
    ........
    ........
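Once set, these options become JVM system properties that your application code can read with System.getProperty() on the appropriate JVM. A minimal sketch, assuming the hypothetical envVar1 property used in the command above:

```scala
import org.apache.spark.sql.SparkSession

object ReadJvmOptions {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("read-jvm-options").getOrCreate()

    // On the driver: reads -DenvVar1 set via spark.driver.extraJavaOptions
    println(s"driver envVar1 = ${System.getProperty("envVar1")}")

    // On executors: reads -DenvVar1 set via spark.executor.extraJavaOptions
    val executorValues = spark.sparkContext
      .parallelize(Seq(1, 2), numSlices = 2)
      .map(_ => System.getProperty("envVar1"))
      .collect()
    executorValues.foreach(v => println(s"executor envVar1 = $v"))

    spark.stop()
  }
}
```

Note that driver and executor options are independent: a property set only via spark.executor.extraJavaOptions returns null on the driver, and vice versa.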

Using SparkConf

You can also set JVM options for the driver and executors when creating the SparkSession. The example below uses Scala; you can achieve the same in PySpark.


import org.apache.spark.sql.SparkSession

// Create SparkSession in spark 2.x or later
val spark = SparkSession.builder().master("local[*]")
    .appName("SparkByExamples.com")
    .config("spark.driver.extraJavaOptions","-DenvVar1=var1Value")
    .config("spark.executor.extraJavaOptions","-DenvVar1=var1Value")
    .getOrCreate()
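For reference, a PySpark equivalent of the Scala builder above might look like the following sketch (same hypothetical -DenvVar1 option; the client-mode caveat about spark.driver.extraJavaOptions applies here as well):

```python
from pyspark.sql import SparkSession

# Create SparkSession with JVM options for the driver and executors
spark = (
    SparkSession.builder.master("local[*]")
    .appName("SparkByExamples.com")
    .config("spark.driver.extraJavaOptions", "-DenvVar1=var1Value")
    .config("spark.executor.extraJavaOptions", "-DenvVar1=var1Value")
    .getOrCreate()
)
```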

Note that when you submit your Spark or PySpark application in client mode, the Spark driver runs on the machine from which you submit the application.

Note: In client mode, spark.driver.extraJavaOptions config must not be set through the SparkConf (using .conf()) directly in your application, because the driver JVM has already started at that point. Instead, please set this through the --driver-java-options command line option or in your default properties file.


spark-submit --master yarn \
    --deploy-mode client \
    --name my-app \
    --driver-java-options "-DenvVar1=var1Value" \
    --conf 'spark.executor.extraJavaOptions=-DenvVar1=var1Value' \
    ........
    ........
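The note above also mentions the default properties file; the same settings can be placed in conf/spark-defaults.conf so they apply to every submission. A sketch with the same hypothetical option:

```properties
# conf/spark-defaults.conf
spark.driver.extraJavaOptions    -DenvVar1=var1Value
spark.executor.extraJavaOptions  -DenvVar1=var1Value
```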

Conclusion

In this article, you learned how to pass JVM arguments or options to the Spark or PySpark driver and executors using spark.driver.extraJavaOptions and spark.executor.extraJavaOptions respectively.

Naveen (NNK)
