SPARK CORE
Define Apache Spark Core.
- Spark Core is the fundamental unit of the whole Spark project.
- It provides all sorts of core functionality, such as:
  - task dispatching,
  - input/output operations,
  - memory management,
  - fault tolerance,
  - scheduling and monitoring jobs, and
  - interacting with storage systems.
- It offers basic connectivity with data sources such as HBase, Amazon S3, and HDFS.
- Spark makes use of a special data structure known as the RDD (Resilient Distributed Dataset). Spark Core is the home of the API that defines and manipulates RDDs (see the first sketch after this list).
- Spark Core is the distributed execution engine on top of which all other functionality is built, for example MLlib, Spark SQL, GraphX, and Spark Streaming. This allows diverse workloads to run on a single platform (see the second sketch after this list).
- In short, Spark Core is the base engine of the Apache Spark framework.
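To make the RDD point concrete, here is a minimal Scala sketch that creates and manipulates an RDD through the Spark Core API; the app name, master URL, and data values are assumptions chosen for a local run.

```scala
import org.apache.spark.sql.SparkSession

object RddSketch {
  def main(args: Array[String]): Unit = {
    // Local session for illustration; app name and master are assumptions.
    val spark = SparkSession.builder()
      .appName("rdd-sketch")
      .master("local[*]")
      .getOrCreate()
    val sc = spark.sparkContext

    // Build an RDD from an in-memory collection and chain transformations.
    val numbers = sc.parallelize(1 to 10)        // RDD[Int]
    val squares = numbers.map(n => n * n)        // lazy transformation
    val evenSquares = squares.filter(_ % 2 == 0) // still lazy

    // An action triggers the actual distributed computation.
    println(evenSquares.collect().mkString(", ")) // 4, 16, 36, 64, 100

    spark.stop()
  }
}
```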
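And a second sketch of the "diverse workloads on a single platform" idea: one SparkSession serving both a Spark SQL query and a plain RDD computation, with both compiled down to the same core engine. The table data and names are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object UnifiedSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("unified-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Spark SQL workload: a DataFrame query, executed by the core engine.
    val people = Seq(("alice", 34), ("bob", 29)).toDF("name", "age")
    people.createOrReplaceTempView("people")
    spark.sql("SELECT name FROM people WHERE age > 30").show()

    // Spark Core workload: a plain RDD computation on the same resources.
    val total = spark.sparkContext.parallelize(1 to 100).sum()
    println(s"sum = $total") // 5050.0

    spark.stop()
  }
}
```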
The key features of Apache Spark Core are:
- It is in charge of essential I/O functionalities.
- It plays a significant role in programming and monitoring the Spark cluster.
- Task dispatching.
- Fault recovery.
- It overcomes the limitations of MapReduce by using in-memory computation rather than writing intermediate results to disk (see the sketch below).
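A minimal sketch of that in-memory advantage, assuming an existing SparkContext `sc` (e.g. from the first sketch above) and a hypothetical log path: caching an RDD lets repeated actions reuse it from memory instead of re-reading from disk, which is where MapReduce-style jobs pay a heavy price.

```scala
// Assuming an existing SparkContext `sc`; the HDFS path is hypothetical.
val logs = sc.textFile("hdfs:///path/to/logs")

// cache() keeps the filtered RDD in executor memory after the first action,
// so subsequent actions reuse it instead of re-reading and re-filtering,
// whereas MapReduce writes intermediate results to disk between jobs.
val errors = logs.filter(_.contains("ERROR")).cache()

println(errors.count())                               // reads from HDFS, then caches
println(errors.filter(_.contains("timeout")).count()) // served from memory
```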
What is the Spark engine's responsibility?
- The Spark engine schedules, distributes, and monitors data applications across the Spark cluster.
- In other words, it is concerned with establishing, distributing, and then monitoring the various sets of data spread across the cluster (see the sketch below).
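A small sketch of what the engine does when a job is submitted, assuming an existing SparkContext `sc` and hypothetical input/output paths: a wide transformation (`reduceByKey`) forces a shuffle, and the engine schedules the resulting stages and tasks across the cluster, all of which can be observed in the Spark web UI.

```scala
// Assuming an existing SparkContext `sc`; both paths are hypothetical.
val words = sc.textFile("hdfs:///path/to/input.txt")
  .flatMap(_.split("\\s+"))
  .map(word => (word, 1))

// reduceByKey introduces a shuffle: the engine cuts the job into stages at
// this boundary and dispatches one task per partition to the executors.
val counts = words.reduceByKey(_ + _)

// The action below submits the job; the scheduler's stages and tasks can be
// monitored in the Spark web UI (port 4040 on the driver by default).
counts.saveAsTextFile("hdfs:///path/to/output")
```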