AWS Certified Data Engineer Associate DEA-C01 Practice Question

A data engineering team runs a nightly Apache Spark ETL job on a 10-node Amazon EMR cluster. The job finishes in 40 minutes, but the cluster sits idle all day. Input volume sometimes spikes, so the job needs extra executors. The team wants to cut costs without refactoring code and must keep using Spot Instances. Which solution meets these requirements with the LEAST operational effort?

  • Replace the cluster with Amazon EMR Serverless and specify a maximum capacity large enough for peak workloads.

  • Re-platform the workload to Amazon EMR on EKS and use Karpenter to auto-scale Spot worker nodes.

  • Enable EMR managed scaling and set the cluster to terminate automatically after 30 minutes of idle time.

  • Migrate the Spark application to AWS Glue and schedule a Glue Spark job that runs on-demand workers.
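To make the managed-scaling option concrete, the following is a minimal sketch of how a scaling policy and a 30-minute idle timeout might be attached to an existing EMR cluster with boto3. The capacity numbers, the helper function names, and the cluster ID are illustrative assumptions, not values taken from the question; only `put_managed_scaling_policy` and `put_auto_termination_policy` are actual EMR client calls.

```python
# Sketch only: helper names and capacity values below are hypothetical.

def managed_scaling_policy(min_units, max_units, max_on_demand_units):
    """Build the ManagedScalingPolicy payload for an instance-group cluster."""
    return {
        "ComputeLimits": {
            "UnitType": "Instances",
            "MinimumCapacityUnits": min_units,
            "MaximumCapacityUnits": max_units,
            # Caps On-Demand capacity; the split above this cap goes to Spot.
            "MaximumOnDemandCapacityUnits": max_on_demand_units,
        }
    }


def auto_termination_policy(idle_minutes):
    """Build the AutoTerminationPolicy payload; IdleTimeout is in seconds."""
    return {"IdleTimeout": idle_minutes * 60}


def apply_policies(emr_client, cluster_id, scaling, termination):
    """Attach both policies to a running cluster via the EMR API."""
    emr_client.put_managed_scaling_policy(
        ClusterId=cluster_id, ManagedScalingPolicy=scaling
    )
    emr_client.put_auto_termination_policy(
        ClusterId=cluster_id, AutoTerminationPolicy=termination
    )


# Example usage (requires boto3 and AWS credentials; cluster ID is made up):
#   import boto3
#   emr = boto3.client("emr")
#   apply_policies(
#       emr,
#       "j-EXAMPLE12345",
#       managed_scaling_policy(min_units=2, max_units=10, max_on_demand_units=2),
#       auto_termination_policy(idle_minutes=30),
#   )
```

Keeping the payload builders separate from the API calls lets the policy values be reviewed or unit-tested without touching a live cluster.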

Data Operations and Support