AWS Certified Data Engineer Associate DEA-C01 Practice Question
A company runs a daily Apache Spark job that converts 500 GB of CSV files in Amazon S3 to Parquet. The job currently runs on an Amazon EMR cluster that remains active for the rest of the day, incurring charges even when it is idle. The data engineering team must reduce processing costs but keep similar performance and avoid managing long-running infrastructure. Which approach meets these requirements MOST cost-effectively?
Move the job to an AWS Glue Spark ETL job that runs on a schedule with a fixed DPU allocation.
Re-implement the workload as an Amazon EMR Serverless application and submit the Spark job on a daily schedule.
Keep the EMR cluster but enable EMR managed scaling with Spot Instances for task nodes.
Convert the cluster to Amazon EKS and run the Spark job as a Kubernetes pod using EMR on EKS.
Amazon EMR Serverless removes the need to provision or manage clusters. It automatically supplies Spark workers when the job starts and releases them after the job finishes, so no compute charges accrue for the rest of the day, when the existing cluster would sit idle. Performance is comparable because the same Spark code and Amazon S3 data source are used.
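As a rough illustration of what submitting the daily job could look like, here is a minimal sketch using the boto3 `emr-serverless` client and its `start_job_run` call. The application ID, role ARN, bucket names, script path, and job name are hypothetical placeholders, not values from the question. The daily trigger itself (for example, a scheduler that invokes this code once a day) is assumed and not shown.

```python
def build_job_run_request(application_id, execution_role_arn,
                          script_uri, input_uri, output_uri):
    """Build keyword arguments for the EMR Serverless StartJobRun API.

    All argument values are supplied by the caller; nothing here is
    specific to the scenario in the question.
    """
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "name": "daily-csv-to-parquet",  # hypothetical job name
        "jobDriver": {
            "sparkSubmit": {
                # PySpark script stored in S3 (hypothetical location)
                "entryPoint": script_uri,
                # Passed to the script as command-line arguments
                "entryPointArguments": [input_uri, output_uri],
            }
        },
    }


def submit_job(request):
    """Submit the job run and return its ID. Requires AWS credentials."""
    import boto3  # imported lazily so the builder works without the SDK

    client = boto3.client("emr-serverless")
    response = client.start_job_run(**request)
    return response["jobRunId"]


if __name__ == "__main__":
    request = build_job_run_request(
        application_id="app-example",  # placeholder
        execution_role_arn="arn:aws:iam::111122223333:role/example-role",
        script_uri="s3://example-bucket/scripts/csv_to_parquet.py",
        input_uri="s3://example-bucket/csv/",
        output_uri="s3://example-bucket/parquet/",
    )
    print(request)
    # Call submit_job(request) only with real identifiers and credentials.
```

Keeping the request builder separate from the API call makes the request easy to inspect or unit test without touching AWS.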
Replacing the cluster with an AWS Glue job could lower cost, but it requires porting code and still bills for all provisioned DPUs during runtime, even if they are only partially utilized. Enabling EMR managed scaling with Spot Instances for task nodes reduces, but does not eliminate, idle charges because the cluster keeps running and its primary node must stay active. Running the job with EMR on EKS means operating an Amazon EKS cluster, which conflicts with the requirement to avoid managing long-running infrastructure. Therefore, EMR Serverless yields the greatest cost savings while meeting the performance and operational requirements.
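The cost argument can be made concrete with a small back-of-the-envelope sketch. The hourly rate and job duration below are made-up placeholders, not AWS prices or figures from the question, and the model assumes the same hourly rate for both setups, which is a simplification.

```python
def always_on_daily_cost(hourly_rate: float, hours_per_day: float = 24.0) -> float:
    """Daily cost of a cluster that is billed for every hour it stays up."""
    return hourly_rate * hours_per_day


def per_job_daily_cost(hourly_rate: float, job_hours: float) -> float:
    """Daily cost when compute is billed only while the job runs."""
    return hourly_rate * job_hours


def idle_share(job_hours: float, hours_per_day: float = 24.0) -> float:
    """Fraction of the always-on bill that pays for idle time."""
    return (hours_per_day - job_hours) / hours_per_day


if __name__ == "__main__":
    rate = 10.0   # hypothetical cost per hour, in arbitrary currency units
    hours = 4.0   # hypothetical job duration in hours

    print("always on:", always_on_daily_cost(rate))
    print("per job:  ", per_job_daily_cost(rate, hours))
    print("idle share of always-on bill:", round(idle_share(hours), 2))
```

Under these assumed inputs, most of the always-on bill pays for idle hours, which is the portion a pay-per-job model avoids.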