AWS Certified Data Engineer Associate DEA-C01 Practice Question

An analytics team stores 2 TB of GZIP-compressed JSON clickstream data in Amazon S3. The data engineer needs a web-based notebook to run PySpark interactively, inspect data samples, and write partitioned Parquet back to S3, without managing clusters and while paying only when code executes. Which AWS approach satisfies these goals?

  • Launch an Amazon EMR cluster with Apache Zeppelin notebooks and configure the cluster to auto-terminate when idle.

  • Create an Amazon Athena for Apache Spark notebook and use the interactive session to transform and write the data.

  • Use AWS Glue DataBrew to profile the JSON files and export the results as Parquet.

  • Build an on-demand AWS Glue PySpark job that is started by AWS Step Functions when transformation is required.
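Whichever service runs it, the transformation described in the question is the same PySpark workload: read the GZIP-compressed JSON (Spark decompresses GZIP transparently), inspect a sample, and write partitioned Parquet back to S3. The sketch below illustrates that workload; the bucket names, the event_time field, and the event_date partition column are hypothetical placeholders, not values given in the question.

```python
# Minimal PySpark sketch of the clickstream transformation (assumed paths/columns).
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("clickstream-json-to-parquet").getOrCreate()

# Spark handles the GZIP codec automatically when reading JSON from S3.
clicks = spark.read.json("s3://example-bucket/raw/clickstream/*.json.gz")

# Inspect the schema and a small sample interactively, as the notebook workflow requires.
clicks.printSchema()
clicks.show(10, truncate=False)

# Derive a partition column (assuming an event_time string/timestamp field exists)
# and write partitioned Parquet back to S3.
(clicks
    .withColumn("event_date", F.to_date("event_time"))
    .write
    .mode("overwrite")
    .partitionBy("event_date")
    .parquet("s3://example-bucket/curated/clickstream_parquet/"))
```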

Domain: Data Operations and Support