SPARK CORE

Define Apache Spark Core.

  • Spark Core is the fundamental unit of the whole Spark project.
  • It provides all the essential functionality, including:
    • task dispatching and scheduling,
    • input-output operations,
    • memory management,
    • fault tolerance,
    • monitoring jobs, and
    • basic connectivity with storage systems and data sources, for example HBase, Amazon S3, and HDFS.
  • Spark makes use of a special data structure known as the RDD (Resilient Distributed Dataset). Spark Core is the home of the API that defines and manipulates RDDs (see the sketch after this list).
  • Spark Core is the distributed execution engine, with all other functionality built on top of it, for example MLlib, Spark SQL, GraphX, and Spark Streaming.
  • This allows diverse workloads to run on a single platform.
  • Apart from this, Spark Core serves as the base engine of the Apache Spark framework.
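
To make this concrete, here is a minimal sketch of the Spark Core RDD API in Scala. The application name and the "local[*]" master URL are illustrative choices for running the example standalone, not requirements of the API.

    import org.apache.spark.{SparkConf, SparkContext}

    object RddBasics {
      def main(args: Array[String]): Unit = {
        // Illustrative app name and local master; on a real cluster the
        // master would point at YARN, Kubernetes, or a standalone master.
        val conf = new SparkConf().setAppName("RddBasics").setMaster("local[*]")
        val sc = new SparkContext(conf)

        // Create an RDD from a local collection; Spark partitions it
        // across the available executors (here, local threads).
        val numbers = sc.parallelize(1 to 10)

        // Transformations such as map() are lazy and only build lineage.
        val squares = numbers.map(n => n * n)

        // collect() is an action: it triggers task dispatching, execution,
        // and the return of results to the driver.
        println(squares.collect().mkString(", "))

        sc.stop()
      }
    }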

The key features of Apache Spark Core are:

  • It is in charge of essential I/O functionalities.
  • It plays a significant role in programming and monitoring the Spark cluster.
  • Task dispatching.
  • Fault recovery.
  • It overcomes the main limitation of MapReduce by using in-memory computation, as the sketch below shows.
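
A short sketch of that in-memory advantage, assuming an existing SparkContext sc (as created above) and a hypothetical input file path:

    // Hypothetical input path; replace with a real dataset.
    val events = sc.textFile("hdfs:///data/events.log")

    // cache() keeps the RDD in executor memory after it is first computed,
    // so later actions reuse it instead of re-reading from disk. This is
    // the key advantage over MapReduce, which persists intermediate
    // results to disk between stages.
    val errors = events.filter(_.contains("ERROR")).cache()

    // The first action materializes and caches the filtered data.
    println(s"error count: ${errors.count()}")

    // The second action is served from memory rather than recomputed
    // from the input file.
    println(s"distinct errors: ${errors.distinct().count()}")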

What are the responsibilities of the Spark engine?

  • The Spark engine schedules, distributes, and monitors data applications across the Spark cluster.
  • In other words, the engine is concerned with establishing, distributing, and then monitoring the various sets of data spread across the cluster, as the sketch below illustrates.
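
The sketch below, again assuming an existing SparkContext sc, shows where this responsibility appears in practice: transformations only record lineage, and the engine schedules, distributes, and monitors tasks only when an action is invoked.

    // Distribute a collection across 8 partitions (an illustrative number).
    val data = sc.parallelize(1 to 1000000, numSlices = 8)

    val pipeline = data
      .map(_ * 2)          // recorded in the lineage graph, not executed yet
      .filter(_ % 3 == 0)  // still just lineage

    // reduce() is an action: the engine builds stages from the lineage,
    // dispatches one task per partition to the executors, monitors them,
    // and re-runs any task that fails (fault recovery via lineage).
    val total = pipeline.reduce(_ + _)
    println(total)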