Spark Executor

In Apache Spark, a distributed agent is responsible for executing tasks. This agent is called the Spark Executor.


What Is a Spark Executor

  • Executors in Spark are processes that run on worker nodes.
  • Each executor is in charge of running individual tasks in a given Spark job. Executors are launched at the start of a Spark application.
  • An executor typically runs for the entire lifetime of the application.
  • As soon as an executor has run a task, it sends the results to the driver.
  • Executors also provide in-memory storage for Spark RDDs that are cached by user programs, through the Block Manager.
  • Keeping executors for the complete lifespan of a Spark application is known as static allocation of executors. Alternatively, we can opt for dynamic allocation.
  • An executor sends heartbeats and metrics with the help of the Heartbeat Sender Thread.
  • One advantage is that we can have as many executors in Spark as there are data nodes.
  • It is also possible to give them as many cores as the cluster can provide.
  • An Apache Spark Executor is described by its id, hostname, environment (as SparkEnv), and classpath.
  • The most important point to note is that executor backends exclusively manage executors in Spark.
  • To summarize, executors have two jobs:
    • First, they run the tasks that make up the application and return the results to the driver.
    • Second, they provide in-memory storage for RDDs that are cached by user programs.

SparkContext connects to a cluster manager and acquires executors on nodes in the cluster. Executors are Spark processes that run computations and store data on the worker node. Finally, SparkContext sends tasks to the executors, which execute them.
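
The choice between static and dynamic allocation is made through configuration. The following is a minimal sketch, assuming the standard Spark property names. The values are purely illustrative, and the ExecutorAllocationSketch object and its toSubmitArgs helper are hypothetical names introduced here, not part of Spark.

```scala
object ExecutorAllocationSketch {
  // Static allocation: a fixed set of executors is requested up front
  // and kept for the whole application. Values are illustrative only.
  val staticAllocation: Map[String, String] = Map(
    "spark.executor.instances" -> "4",
    "spark.executor.cores"     -> "2",
    "spark.executor.memory"    -> "4g"
  )

  // Dynamic allocation: Spark adds and removes executors with the workload.
  // It is commonly paired with the external shuffle service.
  val dynamicAllocation: Map[String, String] = Map(
    "spark.dynamicAllocation.enabled"      -> "true",
    "spark.dynamicAllocation.minExecutors" -> "1",
    "spark.dynamicAllocation.maxExecutors" -> "10",
    "spark.shuffle.service.enabled"        -> "true"
  )

  // Hypothetical helper: renders settings as spark-submit "--conf key=value" arguments,
  // sorted by key so the output is stable.
  def toSubmitArgs(conf: Map[String, String]): Seq[String] =
    conf.toSeq.sortBy(_._1).flatMap { case (k, v) => Seq("--conf", s"$k=$v") }
}
```

Either map could be handed to spark-submit through toSubmitArgs, or set one key at a time on the application's configuration. Which properties are honored depends on the cluster manager in use.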



Conditions to Create Spark Executor

An Executor in Spark is created under the following conditions:

  • When CoarseGrainedExecutorBackend receives a RegisteredExecutor message (Spark Standalone and YARN only).
  • When Mesos's MesosExecutorBackend is registered (Spark on Mesos).
  • When LocalEndpoint is created (local mode).

Creating Spark Executor Instance

Creating a Spark Executor instance takes the following:

  • The executor ID.
  • A SparkEnv, which gives access to the local MetricsSystem and BlockManager, as well as the local serializer.
  • The executor's hostname.
  • A collection of user-defined JARs to add to the tasks' classpath. By default, it is empty.
  • A flag indicating whether it runs in local or cluster mode (disabled by default, i.e. cluster mode is preferred).


When creation is successful, the following INFO message appears in the logs:
INFO Executor: Starting executor ID [executorId] on host [executorHostname]
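
To make the inputs above concrete, here is a toy sketch of such a constructor. ToyExecutor is a hypothetical stand-in, not Spark's Executor class: it leaves out SparkEnv entirely and only mirrors the ID, hostname, user classpath, and local-mode flag, along with the startup message quoted above.

```scala
import java.net.URL

// Hypothetical stand-in for an executor; it only mirrors the inputs listed above.
final class ToyExecutor(
    val executorId: String,
    val executorHostname: String,
    val userClassPath: Seq[URL] = Nil, // user-defined JARs, empty by default
    val isLocal: Boolean = false       // disabled by default, i.e. cluster mode
) {
  // Same shape as the INFO line quoted above, with the placeholders filled in.
  val startupMessage: String =
    s"INFO Executor: Starting executor ID $executorId on host $executorHostname"

  println(startupMessage)
}
```

For instance, new ToyExecutor("1", "worker-1") keeps the defaults for the classpath and the local-mode flag.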

Heartbeater: Heartbeat Sender Thread

The heartbeater is a daemon ScheduledThreadPoolExecutor with a single thread.
The name of this thread pool is driver-heartbeater.
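
A single-threaded daemon scheduled pool of this kind can be sketched with plain JDK classes. This is only an illustration of the idea, not Spark's own code: HeartbeaterSketch, newHeartbeater, and runBeats are hypothetical names, and the "heartbeat" here merely records which thread ran it.

```scala
import java.util.concurrent.{CountDownLatch, ScheduledThreadPoolExecutor, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicReference

object HeartbeaterSketch {
  // Produces daemon threads named "driver-heartbeater".
  private val factory: ThreadFactory = new ThreadFactory {
    override def newThread(r: Runnable): Thread = {
      val t = new Thread(r, "driver-heartbeater")
      t.setDaemon(true)
      t
    }
  }

  // A scheduled pool with exactly one thread.
  def newHeartbeater(): ScheduledThreadPoolExecutor =
    new ScheduledThreadPoolExecutor(1, factory)

  // Schedules a beat at a fixed rate, waits until it has fired `times` times
  // (or 5 seconds pass), and returns the name of the thread that ran it.
  def runBeats(times: Int, intervalMs: Long): String = {
    val pool = newHeartbeater()
    val latch = new CountDownLatch(times)
    val threadName = new AtomicReference[String]("")
    val beat = new Runnable {
      override def run(): Unit = {
        // A real heartbeat would report metrics to the driver here.
        threadName.set(Thread.currentThread().getName)
        latch.countDown()
      }
    }
    pool.scheduleAtFixedRate(beat, 0L, intervalMs, TimeUnit.MILLISECONDS)
    latch.await(5, TimeUnit.SECONDS)
    pool.shutdownNow()
    threadName.get()
  }
}
```

Because the pool has a single thread, beats never overlap: a slow beat delays the next one instead of running alongside it.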

Launching Task: launchTask Method

The launchTask method executes the input serializedTask concurrently.

    launchTask(
      context: ExecutorBackend,
      taskId: Long,
      attemptNumber: Int,
      taskName: String,
      serializedTask: ByteBuffer): Unit

  • Internally, launchTask creates a TaskRunner and registers it in the runningTasks internal registry under its taskId.
  • Afterwards, it executes the TaskRunner on the “Executor task launch worker” thread pool.
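
The two steps above (create and register a runner, then hand it to a thread pool) can be sketched as follows. This is a simplified illustration under stated assumptions, not Spark's implementation: ToyTaskRunner and LaunchTaskSketch are hypothetical names, the ExecutorBackend parameter is omitted, "running" a task just decodes its bytes as UTF-8 text, and removing the runner from the registry when it finishes is an assumption of this sketch.

```scala
import java.nio.ByteBuffer
import java.nio.charset.StandardCharsets
import java.util.concurrent.{ConcurrentHashMap, ExecutorService, Executors, TimeUnit}

object LaunchTaskSketch {
  // Hypothetical stand-in for a task runner: it decodes the payload as text.
  final class ToyTaskRunner(val taskId: Long, val taskName: String, serializedTask: ByteBuffer)
      extends Runnable {
    @volatile var result: Option[String] = None

    override def run(): Unit = {
      try {
        result = Some(StandardCharsets.UTF_8.decode(serializedTask).toString)
      } finally {
        // Assumption of this sketch: a finished task leaves the registry.
        runningTasks.remove(taskId)
      }
    }
  }

  // Registry of in-flight tasks, keyed by task ID.
  val runningTasks = new ConcurrentHashMap[Long, ToyTaskRunner]()

  private val threadPool: ExecutorService = Executors.newCachedThreadPool()

  // Creates a runner, registers it under its task ID, and executes it on the pool.
  // attemptNumber is accepted only to echo the original signature; it is unused here.
  def launchTask(taskId: Long, attemptNumber: Int, taskName: String,
                 serializedTask: ByteBuffer): ToyTaskRunner = {
    val runner = new ToyTaskRunner(taskId, taskName, serializedTask)
    runningTasks.put(taskId, runner)
    threadPool.execute(runner)
    runner
  }

  // Stops accepting tasks and waits briefly for running ones to finish.
  def stop(): Unit = {
    threadPool.shutdown()
    threadPool.awaitTermination(5, TimeUnit.SECONDS)
  }
}
```

Since launchTask only submits the runner and returns, the caller is never blocked by the task itself, which is what lets several tasks run concurrently.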

“Executor Task Launch Worker” Thread Pool: threadPool Property

  • To launch tasks, the executor uses threadPool, a daemon cached thread pool whose threads are identified by a task launch worker ID.
  • threadPool is created at the same time as the Spark Executor and is shut down when the Executor stops.
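
A daemon cached thread pool with numbered worker threads can be approximated with the JDK alone. The sketch below is an assumption-laden illustration rather than Spark's own utility: TaskLaunchPoolSketch, newDaemonCachedThreadPool, and probe are names introduced here, and the "prefix-number" naming scheme is chosen for the example.

```scala
import java.util.concurrent.{Callable, ExecutorService, Executors, ThreadFactory, TimeUnit}
import java.util.concurrent.atomic.AtomicInteger

object TaskLaunchPoolSketch {
  // Cached pool whose threads are daemons named "<prefix>-<n>", with n starting at 0.
  def newDaemonCachedThreadPool(prefix: String): ExecutorService = {
    val counter = new AtomicInteger(0)
    val factory = new ThreadFactory {
      override def newThread(r: Runnable): Thread = {
        val t = new Thread(r, s"$prefix-${counter.getAndIncrement()}")
        t.setDaemon(true)
        t
      }
    }
    Executors.newCachedThreadPool(factory)
  }

  // Runs one trivial job and reports the worker thread's name and daemon flag.
  // The pool is shut down afterwards, mirroring "shut down when the Executor stops".
  def probe(): (String, Boolean) = {
    val pool = newDaemonCachedThreadPool("Executor task launch worker")
    try {
      val job = new Callable[(String, Boolean)] {
        override def call(): (String, Boolean) = {
          val t = Thread.currentThread()
          (t.getName, t.isDaemon)
        }
      }
      pool.submit(job).get(5, TimeUnit.SECONDS)
    } finally {
      pool.shutdown()
    }
  }
}
```

A cached pool grows on demand and reuses idle threads, so it suits a workload where the number of concurrent tasks varies. Daemon threads do not keep the JVM alive once the executor is done.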