Tune Spark Executor Number, Cores, and Memory

How do you tune Spark’s number of executors, executor cores, and executor memory to improve job performance? In Apache Spark, the number of cores and the number of executors are two important configuration parameters that can significantly impact the resource utilization and performance of your Spark application.

1. Spark Executor

An executor is a Spark process responsible for executing tasks on a specific node in the cluster. Each executor is assigned a fixed number of cores and a certain amount of memory. The number of executors, multiplied by the cores per executor, determines the level of parallelism at which Spark can process data.

In general:

Having more executors allows for better parallelism and resource utilization.
Each executor can work on a subset of data independently, which can lead to increased processing speed.
However, it’s important to strike a balance between the number of executors and the available cluster resources. If the number of executors is too high, it can lead to excessive memory usage and increased overhead due to task scheduling.

Advantages:

More executors provide increased parallelism and the ability to process data in parallel.
Each executor can work on a subset of data independently, leading to improved processing speed.
It allows for better resource utilization by distributing the workload across multiple executor processes.

Considerations:

Allocating too many executors can lead to excessive memory usage and increased overhead due to task scheduling.
Inefficient executor allocation can result in the underutilization of cluster resources.
The optimal number of executors depends on factors such as dataset size, computation complexity, and available cluster resources.

2. Spark Cores

The number of cores refers to the total number of processing units available on the machines in your Spark cluster. It sets an upper bound on the parallelism at which Spark can execute tasks, because each core assigned to an executor runs one task at a time.

Increasing the number of cores has the following effects:

It allows Spark to execute more tasks simultaneously, which can improve the overall throughput of your application.
However, adding too many cores can also introduce overhead due to task scheduling and inter-node communication, especially if the cluster resources are limited.
The optimal number of cores depends on factors such as the size of your dataset, the complexity of your computations, and the available cluster resources.

Advantages:

Increasing the number of cores allows for higher parallelism and the ability to execute more tasks simultaneously.
More cores can lead to improved throughput and faster processing of data.
It allows better utilization of available computational resources in the cluster.

Considerations:

Adding too many cores without sufficient resources can lead to resource contention and performance degradation.
Excessive parallelism can introduce overhead due to task scheduling and inter-node communication, impacting performance.
The optimal number of cores depends on the size of the dataset, the complexity of computations, and available cluster resources.
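
The relationship between executors, cores, and concurrent tasks can be sketched as simple arithmetic. This is a minimal illustration, not Spark code: the function name `concurrent_task_slots` and the example numbers are assumptions made for this sketch. The `cpus_per_task` parameter mirrors Spark's `spark.task.cpus` setting, which defaults to 1.

```python
def concurrent_task_slots(num_executors, cores_per_executor, cpus_per_task=1):
    """Return how many tasks the application can run at the same moment."""
    # Each executor offers one slot per group of `cpus_per_task` cores.
    return num_executors * (cores_per_executor // cpus_per_task)


# Hypothetical application: 10 executors with 4 cores each.
slots = concurrent_task_slots(num_executors=10, cores_per_executor=4)
print(slots)  # 40
```

Adding cores raises this number, but only helps while there are enough tasks (partitions) to fill the slots.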

3. Configuring Spark Number of Executors and its Cores

Configuring the number of cores and executors in Apache Spark depends on several factors, including

The characteristics of your workload,
The available cluster resources, and
Specific requirements of your application.

While there is no one-size-fits-all approach, here are some general guidelines to help you configure these parameters effectively:

Number of executors:
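The number of executors per node is the number of cores on the node divided by the cores assigned to each executor. With one core per executor, this equals the number of cores on each node.
With all of a node’s cores assigned to a single executor, the total number of executors equals the number of nodes.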

Memory per executor:
The amount of memory allocated to each executor should be based on the size of the data that will be processed by that executor.
It is important to leave some memory available for the operating system and other processes.
A good starting point is to allocate 1GB of memory per executor.

Number of partitions:
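The number of partitions used for shuffle operations should be at least the total number of executor cores (executors multiplied by cores per executor), so that no core sits idle. With one core per executor, this equals the number of executors.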
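
As a rough sketch of how these three guidelines map onto Spark settings, the snippet below assembles `--conf` arguments for `spark-submit`. The property names `spark.executor.instances`, `spark.executor.memory`, and `spark.sql.shuffle.partitions` are standard Spark properties; the helper `build_conf_args` and the values passed to it are placeholders invented for illustration, not recommendations.

```python
def build_conf_args(num_executors, executor_memory_gb, shuffle_partitions):
    """Return spark-submit --conf arguments for the three guideline values."""
    conf = {
        "spark.executor.instances": str(num_executors),
        "spark.executor.memory": f"{executor_memory_gb}g",
        "spark.sql.shuffle.partitions": str(shuffle_partitions),
    }
    args = []
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    return args


# Placeholder values for illustration only.
conf_args = build_conf_args(num_executors=4, executor_memory_gb=1, shuffle_partitions=4)
print(" ".join(conf_args))
```

Keeping the values in one place like this makes it easier to change them between test runs while tuning.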

Let’s work through how to decide on the number of executors and cores to configure in a cluster. As a running example, say you have a Spark cluster with 16 nodes, each having 8 cores and 32 GB of memory, your dataset is relatively large (around 1 TB), and you’re running complex computations on it.

Note: For the above cluster configuration we have:

Available Resources:
- Total cores in the cluster = 16 nodes * 8 cores per node = 128 cores
- Total memory in the cluster = 16 nodes * 32 GB per node = 512 GB
Workload Characteristics: Large dataset size and complex computations suggest that you need a high level of parallelism to efficiently process the data. Let’s assume that you want to allocate 80% of the available resources to Spark.
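
The resource totals above can be reproduced with a few lines of arithmetic. This is only a sketch of the calculation in the text; the variable names are chosen for this example, and the 80% share is the assumption stated above.

```python
nodes = 16
cores_per_node = 8
memory_per_node_gb = 32
spark_share = 0.80  # fraction of the cluster allocated to Spark (from the text)

total_cores = nodes * cores_per_node          # 16 * 8
total_memory_gb = nodes * memory_per_node_gb  # 16 * 32
# 80% of 512 GB is 409.6 GB, rounded to 410 GB as in the text.
spark_memory_gb = round(total_memory_gb * spark_share)

print(total_cores, total_memory_gb, spark_memory_gb)  # 128 512 410
```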

Now let’s try to analyze the efficient way to decide Spark’s Number of Executors and Cores.

3.1. Tiny Executor Configuration

One way of configuring Spark executors and their cores is to start with a minimal configuration for each executor and increase it based on the application’s performance.

Executor Memory and Cores per Executor: Considering having 1 core per executor,
* Number of executors per node=8,
* Executor-memory=32/8=4GB
Calculating the Number of Executors: To calculate the number of executors, divide the available memory by the executor memory:
* Total memory available for Spark = 80% of 512 GB = 410 GB
* Number of executors = Total memory available for Spark / Executor memory = 410 GB / 4 GB ≈ 102 executors
* Number of executors per node = Total Number of Executors/ Number of Nodes = 102/16 ≈ 6 Executors/Node

So, in this example, you would configure Spark with 102 executors, each executor having 1 core and 4 GB of memory.
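
The tiny-executor arithmetic can be checked with a short sketch. The variable names are illustrative, and integer division stands in for the rounding used in the text.

```python
nodes = 16
cores_per_node = 8
memory_per_node_gb = 32
cores_per_executor = 1  # tiny executors: one core each

# Executors a node could host if only cores mattered, and the memory each gets.
executors_per_node_by_cores = cores_per_node // cores_per_executor      # 8
executor_memory_gb = memory_per_node_gb // executors_per_node_by_cores  # 4

# Memory budget: 80% of the cluster's 512 GB, rounded to 410 GB.
spark_memory_gb = round(nodes * memory_per_node_gb * 0.80)

num_executors = spark_memory_gb // executor_memory_gb  # 102
executors_per_node = num_executors // nodes            # 6

print(num_executors, executors_per_node, executor_memory_gb)  # 102 6 4
```

Note that the memory budget, not the core count, is the limiting factor here: each node could host 8 one-core executors, but only about 6 fit within 80% of its memory.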

Pros of Spark Tiny Executor Configuration:

Resource Efficiency: Each tiny executor consumes less memory and fewer CPU cores than an executor in a larger configuration.
Increased Task Isolation: With tiny executors, each task runs in a more isolated environment. This isolation can prevent interference between tasks, reducing the chances of resource contention and improving the stability of your Spark application.
Task Granularity: Tiny executor configurations can be beneficial if your workload consists of a large number of small tasks. With smaller executors, Spark can allocate resources more precisely, ensuring that each task receives sufficient resources without excessive overprovisioning.

Cons of Spark Tiny Executor Configuration:

Increased Overhead: Using tiny executors can introduce higher overhead due to the increased number of executor processes and task scheduling.
Limited Parallelism: Tiny executors have only one core, so no two tasks run in the same executor process and tasks cannot share in-process resources such as broadcast variables or cached data.
Potential Bottlenecks: In a tiny executor configuration, if a single task takes longer to execute than others, it can become a bottleneck for the entire application.
Memory Overhead: Although tiny executors consume less memory individually, the overhead of multiple executor processes can add up. This can lead to increased memory usage for managing the executor processes, potentially reducing the available memory for actual data processing.

3.2. Fat Executor Configuration

The opposite approach is to start with the largest possible executors, that is, only one executor per node, and then adjust based on the application’s performance.

Executor Memory and Cores per Executor: Considering having 8 cores per executor,
* Number of executors per node= number of cores for a node/ number of cores for an executor = 8/8 = 1,
* Executor-memory=32/1= 32GB
Calculating the Number of Executors: To calculate the number of executors, divide the available memory by the executor memory:
* Total memory available for Spark = 80% of 512 GB = 410 GB
* Number of executors = Total memory available for Spark / Executor memory = 410 GB / 32 GB ≈ 12 executors
* Number of executors per node = Total Number of Executors/ Number of Nodes = 12/16 ≈ 1 Executors/Node

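So, in this example, placing one executor on every node gives 16 executors, each having 8 cores. Note that the 80% memory budget supports only about 12 executors at 32 GB each, so to keep all 16 executors within the budget each one would get about 410 GB / 16 ≈ 25 GB rather than the full 32 GB.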
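
The fat-executor calculation produces two different executor counts, one from the memory budget and one from the node count. The sketch below makes both explicit; the variable names are illustrative and integer division stands in for the rounding in the text.

```python
nodes = 16
cores_per_node = 8
memory_per_node_gb = 32
cores_per_executor = 8  # fat executors: one executor takes every core on its node

executors_per_node = cores_per_node // cores_per_executor      # 1
executor_memory_gb = memory_per_node_gb // executors_per_node  # 32

# Memory budget: 80% of the cluster's 512 GB, rounded to 410 GB.
spark_memory_gb = round(nodes * memory_per_node_gb * 0.80)

# Count allowed by the memory budget at 32 GB per executor.
executors_by_memory = spark_memory_gb // executor_memory_gb   # 12
# Count implied by placing one executor on every node.
executors_by_nodes = nodes * executors_per_node               # 16
# Memory each executor can have if all 16 must fit in the budget.
memory_to_fit_all_gb = spark_memory_gb // executors_by_nodes  # 25

print(executors_by_memory, executors_by_nodes, memory_to_fit_all_gb)  # 12 16 25
```

Giving an executor a node's entire 32 GB also leaves nothing for the operating system, which is another reason to size below the full node memory.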

Pros of Fat Executor Configuration:

Increased Parallelism: Fat executor configurations allocate more CPU cores and memory to each executor, resulting in improved processing speed and throughput.
Reduced Overhead: With fewer executor processes to manage, a fat executor configuration can reduce the overhead of task scheduling, inter-node communication, and executor coordination. This can lead to improved overall performance and resource utilization.
Enhanced Data Locality: Larger executor memory sizes can accommodate more data partitions in memory, reducing the need for data shuffling across the cluster.
Improved Performance for Complex Tasks: By allocating more resources to each executor, you can efficiently handle complex computations and large-scale data processing.

Cons of Fat Executor Configuration:

Resource Overallocation: Using fat executors can result in overallocation of resources, especially if the cluster does not have sufficient memory or CPU cores.
Reduced Task Isolation: With larger executor configurations, tasks have fewer executor processes to run on. This can increase the chances of resource contention and interference between tasks, potentially impacting the stability and performance of your Spark application.
Longer Startup Times: Fat executor configurations require more resources and may have longer startup times compared to smaller configurations.
Difficulty in Resource Sharing: Fat executors may not be efficient when sharing resources with other applications or services running on the same cluster. It can limit the flexibility of resource allocation and hinder the ability to run multiple applications concurrently.

3.3 Balanced Executor Configuration

After trial-and-error testing of executor and core configurations, Databricks (the company founded by the creators of Spark) recommends 2-5 cores per executor as a good initial configuration for running an application smoothly.

Executor Memory and Cores per Executor: Considering 3 cores per executor and leaving 1 core per node for daemon processes,
* Number of executors per node = (number of cores per node – cores for daemon processes) / number of cores per executor = 7/3 ≈ 2,
* Executor-memory = Total memory per node / number of executors per node = 32/2 = 16GB
Calculating the Number of Executors: Two executors on each of the 16 nodes gives 32 executors. Check this against the memory budget:
* Total memory available for Spark = 80% of 512 GB = 410 GB
* Memory needed at 16 GB per executor = 32 * 16 GB = 512 GB, which exceeds the 410 GB budget
* Number of executors that fit at 16 GB each = 410 GB / 16 GB ≈ 25 executors
* Executor memory that fits all 32 executors = 410 GB / 32 ≈ 12 GB per executor

So, in this example, you would configure Spark with 32 executors, each having 3 cores and about 12 GB of memory (or about 25 executors if you keep 16 GB each).
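
Working the balanced-executor arithmetic step by step shows how the core-based and memory-based limits interact: 32 executors at 16 GB each would need all 512 GB, so the 80% budget supports either about 25 executors at 16 GB or 32 executors at about 12 GB. The sketch below uses illustrative variable names and integer division for rounding.

```python
nodes = 16
cores_per_node = 8
memory_per_node_gb = 32
cores_per_executor = 3
daemon_cores = 1  # reserved per node for daemon processes

usable_cores = cores_per_node - daemon_cores                   # 7
executors_per_node = usable_cores // cores_per_executor        # 2
idle_cores_per_node = usable_cores % cores_per_executor        # 1 core left over
executor_memory_gb = memory_per_node_gb // executors_per_node  # 16

# Memory budget: 80% of the cluster's 512 GB, rounded to 410 GB.
spark_memory_gb = round(nodes * memory_per_node_gb * 0.80)

total_executors = nodes * executors_per_node                     # 32
memory_needed_gb = total_executors * executor_memory_gb          # 512, over budget
executors_within_budget = spark_memory_gb // executor_memory_gb  # 25
memory_within_budget_gb = spark_memory_gb // total_executors     # 12

print(total_executors, executors_within_budget, memory_within_budget_gb)  # 32 25 12
```

With 3 cores per executor, one usable core per node stays idle; choosing a core count that divides the usable cores evenly avoids that.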

In practice, one size does not fit all. You need to keep tuning as per cluster configuration. But in general, the number of executor cores should be 2-5.

Pros of Balanced Executor Configuration:

Optimal Resource Utilization: A balanced executor configuration aims to evenly distribute resources across the cluster. This allows for efficient utilization of both CPU cores and memory, maximizing the overall performance of your Spark application.
Reasonable Parallelism: By allocating a moderate number of cores and memory to each executor, a balanced configuration strikes a balance between parallelism and resource efficiency. It can provide a good compromise between the high parallelism of small executors and the resource consumption of large executors.
Flexibility for Multiple Workloads: A balanced configuration allows for accommodating a variety of workloads. It can handle both small and large datasets, as well as diverse computational requirements, making it suitable for environments where multiple applications or different stages of data processing coexist.
Reduced Overhead: Compared to tiny executor configurations, a balanced configuration involves fewer executor processes. This can reduce the overhead of task scheduling, inter-node communication, and executor coordination, leading to improved performance and lower resource consumption.

Cons of Balanced Executor Configuration:

Limited Scaling: A balanced executor configuration may not scale as effectively as configurations with a higher number of cores or executors. In scenarios where the workload or dataset size significantly increases, a balanced configuration may reach its limit, potentially leading to longer processing times or resource contention.
Trade-off in Task Isolation: While a balanced configuration can provide a reasonable level of task isolation, it may not offer the same level of isolation as smaller executor configurations. In cases where tasks have distinct resource requirements or strict isolation requirements, a balanced configuration may not be the most suitable choice.
Task Granularity: In situations where the workload consists of a large number of small tasks, a balanced executor configuration may not offer the same level of fine-grained task allocation as smaller executor configurations. This can lead to suboptimal resource allocation and potentially impact performance.
Complexity in Resource Management: Maintaining a balanced executor configuration across a dynamic cluster can be challenging. As the cluster size and resource availability change, it may require frequent adjustments to ensure the configuration remains balanced, which can add complexity to cluster management.

4. Choosing Between Tiny, Fat, and Balanced Executor Configurations

The choice between tiny, fat, and balanced executor configurations in Apache Spark depends on the specific requirements of your workload and the available cluster resources. Here’s a summary of the considerations for each configuration:

Tiny Executor Configuration:

Pros: Resource efficiency, increased task isolation, better granularity for small tasks, and suitability for limited resources or a high number of concurrent tasks.
Cons: Increased overhead, limited parallelism, potential bottlenecks, and memory overhead.

Fat Executor Configuration:

Pros: Increased parallelism, reduced overhead, enhanced data locality, and improved performance for complex tasks.
Cons: Resource overallocation, reduced task isolation, longer startup times, and potential challenges in resource sharing.

Balanced Executor Configuration:

Pros: Optimal resource utilization, reasonable parallelism, flexibility for multiple workloads, and reduced overhead.
Cons: Limited scaling, trade-off in task isolation, potential task granularity issues, and complexity in resource management.

5. Conclusion

In conclusion, Spark’s number of executors and cores plays a crucial role in achieving optimal performance and resource utilization for your Spark application.

Finding the optimal configuration for the number of executors and cores involves considering the characteristics of your workload and the available cluster resources. There is no one-size-fits-all configuration: the optimal settings vary with your specific workload, data size, computational complexity, and cluster resources. Experiment, analyze performance metrics, monitor resource utilization, and benchmark to fine-tune the number of executors and cores for your Spark application.
