What will happen if a Spark job fails at 50% completion?
💡 Model Answer
When a Spark job fails partway through, Spark's fault-tolerance mechanisms take over:

- Task failures: each stage is divided into tasks that run on executors. If a task fails, Spark automatically retries it, up to the limit set by spark.task.maxFailures (default 4).
- Executor or node loss: tasks that were running on the lost executor are re-scheduled on healthy executors. Any shuffle output stored on that executor is lost as well, so downstream stages may hit fetch failures.
- Stage recomputation: the driver tracks the lineage of RDDs and DataFrames. If a stage's input data is missing (for example, lost shuffle files), Spark recomputes only the upstream partitions needed to regenerate it, re-attempting the stage up to spark.stage.maxConsecutiveAttempts times (default 4).
- Driver failure: in the worst case, if the driver itself crashes, the whole job is aborted and the application must be restarted.

Spark does not roll back work that was already completed; it simply recomputes the missing parts. Therefore, a failure at 50% completion typically results in retries of failed tasks, possible stage re-execution, and eventual job completion, provided the failures are transient and the retry limits are not exceeded.
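The retry behavior described above is controlled through configuration. A minimal sketch of how these settings might be passed at submission time (my_job.py is a hypothetical application; the values shown are the defaults, included here for illustration):

```shell
# Hypothetical spark-submit invocation showing the retry-related settings.
spark-submit \
  --conf spark.task.maxFailures=4 \
  --conf spark.stage.maxConsecutiveAttempts=4 \
  my_job.py
```

Raising spark.task.maxFailures can help long jobs survive transient failures (e.g. spot-instance loss), at the cost of masking genuinely broken tasks for longer.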
This answer was generated by AI for study purposes. Use it as a starting point — personalize it with your own experience.