Which AWS platform is used to run and transform Spark jobs, and what parameters are required to trigger a Spark transformation?

🟡 Medium Conceptual Junior level
Times asked: 1
Last seen: May 2026
First seen: May 2026

💡 Model Answer

AWS Glue and Amazon EMR are the two primary services that run Apache Spark workloads in the AWS ecosystem. Glue is a fully managed ETL service that uses a serverless Spark runtime; you submit a Glue job that points to a Python or Scala script, and Glue handles cluster provisioning, scaling, and job scheduling. EMR, on the other hand, gives you a managed Hadoop ecosystem where you can launch a Spark cluster, install custom libraries, and run Spark jobs in a more traditional cluster‑centric way.

To trigger a Spark transformation in either service, you need to provide several key parameters:

  1. Job name – a unique identifier for the job.
  2. Script location – the S3 path to the Spark script (Python/Scala).
  3. IAM role – a role that Glue assumes (on EMR, the service role plus the EC2 instance profile) with permission to read and write the S3 data, write CloudWatch logs, etc.
  4. Spark configuration – optional key/value pairs such as --conf spark.executor.memory=4g or --conf spark.sql.shuffle.partitions=200.
  5. Deployment mode – --deploy-mode cluster for EMR, or the Glue job type (Spark ETL, Python shell, etc.).
  6. Resource specifications – for EMR you set the number of EC2 instances, instance type, and EBS size; for Glue you set the DPUs (or, on current Glue versions, the worker type and number of workers).
  7. Parameters – any job‑specific arguments you want to pass to the script.
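As a rough illustration of how the list above maps onto Glue, the sketch below builds the keyword arguments that boto3's `glue.create_job` and `glue.start_job_run` accept. In Glue the script location, IAM role, and worker settings belong to the job definition, while per-run arguments are passed when the run starts. The helper names, bucket, role ARN, and default worker settings are hypothetical placeholders, and the actual AWS calls are left as comments so the snippet runs without credentials.

```python
# Minimal sketch: build request payloads for AWS Glue (no AWS call is made here).
# Helper names, the bucket, and the role ARN are illustrative assumptions.


def build_glue_job_definition(job_name, script_s3_path, role_arn,
                              worker_type="G.1X", number_of_workers=2):
    """Keyword arguments for glue.create_job (set once per job)."""
    return {
        "Name": job_name,                      # 1. job name
        "Role": role_arn,                      # 3. IAM role Glue assumes
        "Command": {
            "Name": "glueetl",                 # 5. Glue job type: Spark ETL
            "ScriptLocation": script_s3_path,  # 2. script location in S3
            "PythonVersion": "3",
        },
        "GlueVersion": "4.0",                  # assumed Glue version
        "WorkerType": worker_type,             # 6. resource specification
        "NumberOfWorkers": number_of_workers,
    }


def build_glue_run_request(job_name, script_args):
    """Keyword arguments for glue.start_job_run (set per run)."""
    return {
        "JobName": job_name,
        # 7. Glue job arguments are conventionally keyed with a leading "--".
        "Arguments": {f"--{key}": str(value) for key, value in script_args.items()},
    }


definition = build_glue_job_definition(
    "demo-job",
    "s3://example-bucket/scripts/transform.py",
    "arn:aws:iam::123456789012:role/ExampleGlueRole",
)
run_request = build_glue_run_request("demo-job", {"run_date": "2026-05-01"})

# With credentials configured, the payloads would be used roughly like this:
#   import boto3
#   glue = boto3.client("glue")
#   glue.create_job(**definition)
#   run_id = glue.start_job_run(**run_request)["JobRunId"]
```

Keeping the payload construction separate from the client call makes it easy to review or unit test the parameters before anything is submitted.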

Once these parameters are supplied, you submit the job through the AWS console, the CLI (aws glue start-job-run or aws emr add-steps), or an SDK. The call returns an identifier for monitoring (a job run ID for Glue, a step ID for EMR), and the Spark engine then executes the transformation on the data in S3 or other data stores. Glue streams logs to CloudWatch; EMR keeps step logs on the cluster and archives them to S3 when a log URI is configured.
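For the EMR path, a Spark job is usually submitted as a step that runs spark-submit through command-runner.jar. The sketch below builds one such step definition in the shape that boto3's `emr.add_job_flow_steps` expects. The helper name, script path, and cluster ID are hypothetical, and the AWS call itself is shown only as a comment.

```python
# Minimal sketch: build an EMR step that runs spark-submit (no AWS call is made here).
# The helper name, S3 paths, and cluster ID are illustrative assumptions.


def build_spark_step(name, script_s3_path, spark_conf=None, script_args=None):
    """Return one step definition for emr.add_job_flow_steps."""
    args = ["spark-submit", "--deploy-mode", "cluster"]
    # Optional Spark configuration, e.g. spark.executor.memory=4g.
    for key, value in (spark_conf or {}).items():
        args += ["--conf", f"{key}={value}"]
    args.append(script_s3_path)        # the Spark script in S3
    args += list(script_args or [])    # job-specific arguments for the script
    return {
        "Name": name,
        "ActionOnFailure": "CONTINUE",  # keep the cluster running if the step fails
        "HadoopJarStep": {"Jar": "command-runner.jar", "Args": args},
    }


step = build_spark_step(
    "demo-transform",
    "s3://example-bucket/scripts/transform.py",
    spark_conf={"spark.executor.memory": "4g"},
    script_args=["--run_date", "2026-05-01"],
)

# With credentials and a running cluster, the step would be submitted roughly like this:
#   import boto3
#   emr = boto3.client("emr")
#   response = emr.add_job_flow_steps(JobFlowId="j-EXAMPLE", Steps=[step])
#   step_id = response["StepIds"][0]
```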

This answer was generated by AI for study purposes. Use it as a starting point — personalize it with your own experience.
