Spark (Cont'd)

Execution steps
- Code is written and submitted for execution
- The driver program launches
- The cluster manager provisions resources (executors) for the application
- At the same time, the SparkContext is created; calling an action initiates execution
- Spark creates a job using the DAG as a pipeline (aka the RDD lineage)
- Each unit of work is called a task; there is one task per partition, and tasks are distributed to the worker nodes
- The application code is also shipped to the worker nodes (see the sketch after this list)
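To make these steps concrete, here is a minimal PySpark sketch, assuming a local[*] master so the driver, cluster manager, and workers all run in one process; the dataset and variable names are illustrative, not from the original notes. Transformations only extend the lineage; the `count()` action is what triggers a job.

```python
# Minimal sketch of the execution steps above (assumes local[*] master;
# `numbers`, `squares`, `evens` are illustrative names).
from pyspark.sql import SparkSession

# Launch the driver program. Building the SparkSession creates the
# SparkContext, which contacts the cluster manager for resources.
spark = SparkSession.builder \
    .appName("execution-steps-demo") \
    .master("local[*]") \
    .getOrCreate()
sc = spark.sparkContext

# Transformations are lazy: they only extend the RDD lineage (the DAG).
numbers = sc.parallelize(range(1_000_000), numSlices=8)  # 8 partitions
squares = numbers.map(lambda x: x * x)
evens = squares.filter(lambda x: x % 2 == 0)

# Calling an action initiates a job: Spark turns the DAG into stages,
# creates one task per partition (8 here), and distributes the tasks
# to the worker nodes.
print(evens.count())

spark.stop()
```

With local[*] everything runs in a single JVM, but the same flow applies unchanged on a real cluster manager such as YARN or Kubernetes; only the `master` setting differs.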
