AWS Certified Data Engineer Associate DEA-C01 Practice Question

A media company performs a weekly Spark ETL job on 40 TB of log files stored in Amazon S3. Intermediate shuffle data is needed only while the job runs; after the job finishes, the cluster is terminated. The team wants to minimize storage costs yet maintain high I/O throughput for the shuffle phase. Which solution meets these requirements?

  • Copy the data into an Amazon EFS file system and run the ETL using Spark containers on AWS Fargate.

  • Create an Amazon EMR cluster and attach gp3 EBS volumes sized to store the entire 40 TB dataset; enable EMRFS consistent view for metadata operations.

  • Load the data into an Amazon Redshift RA3 cluster with managed storage and perform the transformation using Redshift Spectrum.

  • Launch an Amazon EMR cluster that uses EC2 instances with locally attached NVMe instance store volumes as HDFS storage, and avoid provisioning additional EBS volumes.
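For context, a minimal sketch of what the instance-store-backed EMR option could look like using boto3 is shown below. The cluster name, instance types, instance counts, and region are illustrative assumptions rather than details taken from the question; the key idea is that storage-optimized instances expose local NVMe instance store that EMR uses for HDFS and Spark shuffle data without any extra EBS volumes, and the ephemeral storage disappears at no additional cost when the cluster terminates.

```python
import boto3

# Hypothetical example: launch a transient EMR cluster whose core nodes use
# local NVMe instance store (e.g. i3en instances) for HDFS / shuffle data.
emr = boto3.client("emr", region_name="us-east-1")  # region is an assumption

response = emr.run_job_flow(
    Name="weekly-log-etl",            # illustrative job name
    ReleaseLabel="emr-6.15.0",
    Applications=[{"Name": "Spark"}],
    Instances={
        "InstanceGroups": [
            {
                "Name": "Primary",
                "InstanceRole": "MASTER",
                "InstanceType": "m5.xlarge",
                "InstanceCount": 1,
            },
            {
                "Name": "Core-NVMe",
                "InstanceRole": "CORE",
                "InstanceType": "i3en.2xlarge",  # local NVMe instance store, no EBS added
                "InstanceCount": 10,             # sizing is an assumption
            },
        ],
        # Terminate the cluster once the ETL steps finish, so the
        # intermediate shuffle storage incurs no ongoing cost.
        "KeepJobFlowAliveWhenNoSteps": False,
        "TerminationProtected": False,
    },
    JobFlowRole="EMR_EC2_DefaultRole",
    ServiceRole="EMR_DefaultRole",
)
print(response["JobFlowId"])
```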

Data Store Management