
Spark Get the Current SparkContext Settings


In Spark/PySpark you can get the currently active SparkContext and its configuration settings by accessing spark.sparkContext.getConf.getAll(), where spark is a SparkSession object and getAll() returns Array[(String, String)]. Let's see how with examples in Spark with Scala and PySpark (Spark with Python).

Spark Get SparkContext Configurations

In the Spark example below, I add an extra configuration to Spark using SparkConf and then retrieve all config values from the SparkContext, the defaults along with the one I added.


import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

// Add a custom setting on top of the defaults
val config = new SparkConf()
config.set("spark.sql.shuffle.partitions", "300")
val spark = SparkSession.builder().config(config).master("local[3]")
    .appName("SparkByExamples.com")
    .getOrCreate()

// getAll returns Array[(String, String)] with every setting on the SparkContext
val arrayConfig = spark.sparkContext.getConf.getAll
for (conf <- arrayConfig)
    println(conf._1 + ", " + conf._2)

This yields the output below.


spark.app.name, SparkByExamples.com
spark.app.id, local-1618196887324
spark.driver.host, DELL-ESUHAO2KAJ
spark.master, local[3]
spark.executor.id, driver
spark.driver.port, 52984

Use the get() method of SparkConf to get the value of a specific configuration.


print("spark.sql.shuffle.partitions ==> "+spark.sparkContext.getConf.get("spark.sql.shuffle.partitions"))
// Display below value
// spark.sql.shuffle.partitions ==> 300

PySpark Get SparkContext Configurations

Similarly, let's see how to get the current SparkContext configuration settings in PySpark.


from pyspark.sql import SparkSession
spark = SparkSession.builder.appName('SparkByExamples.com').getOrCreate()

# getAll() returns a list of (key, value) tuples
configurations = spark.sparkContext.getConf().getAll()
for item in configurations:
    print(item)

This prints the configuration shown below. Alternatively, you can also get the PySpark configurations using the internal spark.sparkContext._conf.getAll(), shown after the output.


('spark.app.name', 'SparkByExamples.com')
('spark.rdd.compress', 'True')
('spark.driver.host', 'DELL-ESUHAO2KAJ')
('spark.serializer.objectStreamReset', '100')
('spark.submit.pyFiles', '')
('spark.executor.id', 'driver')
('spark.submit.deployMode', 'client')
('spark.app.id', 'local-1617974806929')
('spark.ui.showConsoleProgress', 'true')
('spark.master', 'local[1]')
('spark.driver.port', '65211')
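
As a quick sketch of the alternative mentioned above: _conf is the SparkConf object held internally by the SparkContext, so it exposes the same getAll() method. The leading underscore marks it as an internal attribute, so prefer getConf() where you can.


# spark.sparkContext._conf is the internal SparkConf object; it returns
# the same list of (key, value) tuples as getConf().getAll().
for key, value in spark.sparkContext._conf.getAll():
    print(key, '=', value)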

To get a specific configuration, pass its key to get().


print(spark.sparkContext.getConf().get("spark.driver.host"))
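
One thing to keep in mind: keys you never set explicitly (such as spark.sql.shuffle.partitions in the session above) may not appear in the SparkConf at all. As a small sketch, PySpark's get() also accepts a default value to fall back on in that case; here I use '200', which is Spark's documented default for this setting.


# Fall back to a default when the key was never set on this SparkContext.
# '200' is Spark's documented default for spark.sql.shuffle.partitions.
partitions = spark.sparkContext.getConf().get("spark.sql.shuffle.partitions", "200")
print("spark.sql.shuffle.partitions ==> " + partitions)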

Conclusion

By using the getAll() method of SparkConf you can get all of the currently active Spark/PySpark SparkContext settings, and you can use the get() method to get the value of a specific setting.

Happy Learning !!
