AWS Certified Data Engineer Associate DEA-C01 Practice Question
A company needs to perform a nightly ETL job that executes a custom PySpark script stored in Amazon S3. The solution must let the team run additional Bash commands during cluster startup to install a third-party library, automatically provision and terminate compute nodes, and scale workers based on the current job's YARN resource usage. Which AWS service best meets these requirements?
Amazon EMR covers each requirement. It can launch a transient cluster, run user-supplied Bash or Python bootstrap actions at startup to install custom software, and submit PySpark steps that read from and write to Amazon S3. EMR Managed Scaling monitors YARN metrics and adds or removes worker nodes automatically, and the cluster terminates when all steps finish.
AWS Glue also runs PySpark, but it is a serverless job environment that does not expose cluster-level bootstrap scripting or YARN-based autoscaling.
Amazon Athena provides serverless SQL and interactive Apache Spark notebooks, but it offers no way to run startup scripts, install system packages on workers, or tune resource-based autoscaling.
Amazon Redshift is a managed data warehouse that supports SQL and user-defined functions, but it has no Spark engine, bootstrap actions, or YARN scheduler.
Therefore, Amazon EMR is the only service that satisfies every stated requirement.
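The EMR features named in the explanation map onto parameters of the EMR RunJobFlow API. The following is a minimal sketch, not a production configuration: it builds the keyword arguments for boto3's `run_job_flow` call. The cluster name, S3 URIs, release label, instance types, capacity limits, and region are illustrative assumptions, and the role names assume the default EMR roles exist in the account.

```python
from typing import Any, Dict


def build_job_flow_request(script_uri: str, bootstrap_uri: str) -> Dict[str, Any]:
    """Build keyword arguments for boto3's EMR `run_job_flow` call."""
    return {
        "Name": "nightly-etl",          # hypothetical cluster name
        "ReleaseLabel": "emr-6.15.0",   # assumed release; use the one you have tested
        "Applications": [{"Name": "Spark"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "primary", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # Transient cluster: terminate once every step has finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        # Bootstrap action: a script in S3 that runs on each node at startup,
        # for example to install the third-party library.
        "BootstrapActions": [
            {"Name": "install-library",
             "ScriptBootstrapAction": {"Path": bootstrap_uri, "Args": []}},
        ],
        # PySpark step submitted through command-runner.jar.
        "Steps": [
            {"Name": "run-pyspark-etl",
             "ActionOnFailure": "TERMINATE_CLUSTER",
             "HadoopJarStep": {
                 "Jar": "command-runner.jar",
                 "Args": ["spark-submit", "--deploy-mode", "cluster", script_uri],
             }},
        ],
        # EMR Managed Scaling: resize between these limits (values assumed).
        "ManagedScalingPolicy": {
            "ComputeLimits": {
                "UnitType": "Instance",
                "MinimumCapacityUnits": 2,
                "MaximumCapacityUnits": 10,
            }
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumes the default roles exist
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(request: Dict[str, Any], region: str = "us-east-1") -> str:
    """Submit the request to EMR. Needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above works without boto3

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]
```

A nightly scheduler could call `launch(build_job_flow_request(script_uri, bootstrap_uri))` with the team's own S3 locations. Keeping the request builder separate from the API call lets the configuration be inspected or unit-tested without AWS access.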
AWS Certified Data Engineer Associate DEA-C01
Data Operations and Support