AWS Certified Data Engineer Associate DEA-C01 Practice Question
A media company uses an Amazon S3 data lake. CSV files are delivered every hour to the prefix s3://company-raw/year=/month=/day=/. A data engineer must convert each new batch to Apache Parquet, partitioned by the same date keys, and catalog the resulting tables so they are queryable in Amazon Athena. The solution must:
avoid re-processing files that were already converted
scale without provisioning or managing servers
require the least custom code
Which approach meets these requirements MOST cost-effectively?
Launch an AWS Glue Python shell job on an hourly schedule that reads the CSV files with pandas, converts them to Parquet, and writes the results to the curated prefix.
Set up an Amazon Kinesis Data Firehose delivery stream with an S3 source and Parquet output conversion enabled, then point it at the raw bucket prefix.
Create an AWS Glue Spark ETL job that reads from the raw S3 prefix, enables job bookmarks, writes the output in Parquet to an s3://company-curated/ prefix partitioned by year, month, and day, and updates the AWS Glue Data Catalog on each run.
Configure an AWS Lambda function triggered by S3 ObjectCreated events that converts each CSV file to Parquet, writes it to the curated bucket, and uses the Athena API to add partitions.
An AWS Glue ETL job that runs on the serverless Spark runtime provides built-in readers for CSV, writers for Parquet, and automatic integration with the AWS Glue Data Catalog. When job bookmarks are enabled, the job maintains state about the input files it has already processed, so subsequent runs skip those objects and process only new data. The job can be triggered on an hourly schedule with no infrastructure to manage.
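The pattern can be illustrated with a short Glue ETL script. The sketch below is hypothetical: the curated path matches the option above, but the database name, table name, and transformation_ctx labels are placeholders, and it assumes the year, month, and day columns are present in (or derived from) the input data.

```python
# Hypothetical Glue ETL (PySpark) script sketch for the CSV-to-Parquet pattern.
import sys
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext
from awsglue.context import GlueContext
from awsglue.job import Job

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glueContext = GlueContext(SparkContext())
job = Job(glueContext)
job.init(args["JOB_NAME"], args)  # bookmarks track state under this job name

# Read only new CSV objects from the raw prefix; the transformation_ctx key is
# what the job bookmark uses to remember which files were already processed.
raw = glueContext.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://company-raw/"], "recurse": True},
    format="csv",
    format_options={"withHeader": True},
    transformation_ctx="raw_source",
)

# Write Parquet to the curated prefix, partitioned by the same date keys, and
# update the Glue Data Catalog so Athena sees new partitions on each run.
sink = glueContext.getSink(
    connection_type="s3",
    path="s3://company-curated/",
    enableUpdateCatalog=True,
    updateBehavior="UPDATE_IN_DATABASE",
    partitionKeys=["year", "month", "day"],
    transformation_ctx="curated_sink",
)
sink.setFormat("glueparquet")
sink.setCatalogInfo(catalogDatabase="media_lake", catalogTableName="events_curated")
sink.writeFrame(raw)

job.commit()  # advances the bookmark so the next run skips these objects
```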
A Python shell job does not support job bookmarks, so it would re-process the same objects on every run unless additional tracking logic is written. The Lambda approach is serverless, but it requires custom code for the Parquet conversion and for adding partitions through the Athena API, and large hourly batches can exceed Lambda's memory and 15-minute timeout limits. Kinesis Data Firehose is designed for streaming ingestion; it cannot use an S3 prefix as a source, so it cannot convert files that have already been delivered to the raw bucket.
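For reference, the bookmark option and hourly schedule described above are set on the job itself rather than in the script. A rough boto3 sketch follows; the job name, IAM role, and script location are placeholders, not values from the question.

```python
# Hypothetical provisioning sketch using boto3.
import boto3

glue = boto3.client("glue")

# Create the Spark ETL job with job bookmarks enabled by default.
glue.create_job(
    Name="csv-to-parquet-hourly",
    Role="arn:aws:iam::123456789012:role/GlueEtlRole",
    Command={
        "Name": "glueetl",  # serverless Spark runtime
        "ScriptLocation": "s3://company-scripts/csv_to_parquet.py",
        "PythonVersion": "3",
    },
    DefaultArguments={"--job-bookmark-option": "job-bookmark-enable"},
    GlueVersion="4.0",
    WorkerType="G.1X",
    NumberOfWorkers=2,
)

# Run the job at the top of every hour, matching the hourly CSV deliveries.
glue.create_trigger(
    Name="csv-to-parquet-hourly-trigger",
    Type="SCHEDULED",
    Schedule="cron(0 * * * ? *)",
    Actions=[{"JobName": "csv-to-parquet-hourly"}],
    StartOnCreation=True,
)
```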