AWS Certified Data Engineer Associate DEA-C01 Practice Question

Your company stores about 14 TB of CloudTrail JSON logs in an S3 bucket each day. Auditors need to run ad-hoc Athena queries on the last 90 days of logs while minimizing storage and scan costs. A nightly job must convert the logs to partitioned Parquet, update the Glue Data Catalog, respect existing Lake Formation table permissions, and scale automatically without any cluster management. Which solution meets these requirements?

Schedule an Amazon EMR Serverless Spark job with EventBridge to convert the JSON logs to date-partitioned Parquet in S3, update the Glue Data Catalog, and rely on Lake Formation for table governance.
Maintain a persistent EMR on EC2 cluster with a cron-based Spark step that converts and catalogs the logs and scales the cluster with EMR auto-scaling policies.
Trigger nightly Athena CREATE TABLE AS SELECT statements from an AWS Lambda function to convert the JSON logs to Parquet and add partitions with MSCK REPAIR TABLE.
Enable CloudTrail Lake, import the S3 log bucket each night, and let auditors query the event data store while exporting results to Athena when necessary.
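
To make the first option more concrete, here is a minimal Python sketch of how a nightly submission to EMR Serverless might be assembled. It only builds the request that an EventBridge-triggered caller could pass to the boto3 `emr-serverless` client's `start_job_run`; nothing is sent to AWS. The bucket name, the `cloudtrail_parquet/` prefix, the `year=/month=/day=` partition layout, and every ID, ARN, and script path are hypothetical placeholders, not values from the question.

```python
from datetime import date

# Spark setting that points the Hive metastore client at the Glue Data Catalog.
GLUE_CATALOG_CONF = (
    "--conf spark.hadoop.hive.metastore.client.factory.class="
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def partition_prefix(bucket: str, day: date) -> str:
    """Return a Hive-style date partition prefix (assumed layout)."""
    return (
        f"s3://{bucket}/cloudtrail_parquet/"
        f"year={day:%Y}/month={day:%m}/day={day:%d}/"
    )


def build_job_run_request(application_id: str, role_arn: str,
                          script_uri: str, source_uri: str,
                          target_uri: str, run_date: date) -> dict:
    """Build keyword arguments for emr-serverless start_job_run."""
    return {
        "applicationId": application_id,
        "executionRoleArn": role_arn,
        "name": f"cloudtrail-to-parquet-{run_date.isoformat()}",
        "jobDriver": {
            "sparkSubmit": {
                # Hypothetical PySpark script that reads the JSON logs and
                # writes date-partitioned Parquet.
                "entryPoint": script_uri,
                "entryPointArguments": [
                    source_uri, target_uri, run_date.isoformat()
                ],
                "sparkSubmitParameters": GLUE_CATALOG_CONF,
            }
        },
    }


if __name__ == "__main__":
    day = date(2024, 3, 5)
    request = build_job_run_request(
        application_id="example-application-id",                    # placeholder
        role_arn="arn:aws:iam::111122223333:role/example-job-role",  # placeholder
        script_uri="s3://example-scripts/convert_cloudtrail.py",    # placeholder
        source_uri="s3://example-raw-logs/AWSLogs/",                # placeholder
        target_uri=partition_prefix("example-curated-logs", day),
        run_date=day,
    )
    print(request["name"])
    # With credentials configured, the request could be submitted with:
    #   boto3.client("emr-serverless").start_job_run(**request)
```

Keeping the request construction separate from the AWS call lets the partition layout and job arguments be checked locally before anything is scheduled.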

AWS Certified Data Engineer Associate DEA-C01

Data Security and Governance


