AWS Certified Data Engineer Associate DEA-C01 Practice Question

A data engineering team launches a transient Amazon EMR cluster each night through an AWS Step Functions workflow. Before any Spark job runs, the cluster must have a proprietary JDBC driver installed on every node. After installation, a PySpark ETL script stored in Amazon S3 must be executed. What is the most operationally efficient way to meet these requirements using native EMR scripting capabilities?

  • Configure a bootstrap action that downloads and installs the driver on all nodes, then add an EMR step that runs spark-submit on the PySpark script in Amazon S3.

  • Schedule an EMR Notebook that first installs the driver with pip commands and then executes the PySpark code, triggered nightly by a cron expression.

  • Build a custom AMI with the driver pre-installed and specify the PySpark ETL through classification properties when creating the cluster.

  • Pass a shell script to a Hadoop Streaming step that both installs the driver and calls the PySpark script in a single command.
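The bootstrap-action-plus-step pattern described in the first option can be sketched as follows. All bucket names, file paths, and the release label are hypothetical placeholders, not values from the question; the dict mirrors the shape of a boto3 `emr.run_job_flow` request (or the equivalent Step Functions `createCluster` parameters) but is built with plain Python so nothing here calls AWS.

```python
# Sketch of the bootstrap-action + EMR-step pattern (option 1).
# All S3 paths are hypothetical placeholders. The resulting dict has the
# shape expected by boto3's emr.run_job_flow(**config), or the equivalent
# Parameters block of a Step Functions elasticmapreduce:createCluster task.

def build_cluster_config(bucket: str) -> dict:
    """Assemble an EMR cluster request with a bootstrap action that
    installs a JDBC driver on every node, followed by a step that
    runs spark-submit on a PySpark script stored in S3."""
    return {
        "Name": "nightly-transient-etl",
        "ReleaseLabel": "emr-6.15.0",
        # Bootstrap actions run on every node before applications start,
        # which is what makes them suitable for node-level driver installs.
        "BootstrapActions": [
            {
                "Name": "install-jdbc-driver",
                "ScriptBootstrapAction": {
                    "Path": f"s3://{bucket}/bootstrap/install_driver.sh"
                },
            }
        ],
        # Steps run after the cluster is ready; command-runner.jar lets a
        # step invoke spark-submit directly against a script in S3.
        "Steps": [
            {
                "Name": "run-pyspark-etl",
                "ActionOnFailure": "TERMINATE_CLUSTER",
                "HadoopJarStep": {
                    "Jar": "command-runner.jar",
                    "Args": [
                        "spark-submit",
                        "--deploy-mode", "cluster",
                        f"s3://{bucket}/scripts/etl_job.py",
                    ],
                },
            }
        ],
    }

config = build_cluster_config("my-etl-bucket")
print(config["Steps"][0]["HadoopJarStep"]["Args"][0])  # spark-submit
```

Because bootstrap actions always complete before any step runs, this ordering guarantees the driver is present on every node before the PySpark job starts, with no custom AMI maintenance or notebook scheduling required.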

Domain: Data Operations and Support