AWS Certified Data Engineer Associate DEA-C01 Practice Question

A company stores six months of IoT sensor readings as GZIP-compressed CSV files in an Amazon S3 data lake. Business analysts use Amazon Athena to run ad-hoc queries several times per day and are concerned about high query latency and the cost of data scanned. Without changing the SQL that analysts run, which approach will most effectively reduce both latency and Athena query costs?

  • Split the existing CSV files into 128 MB objects to increase parallelism when Athena reads them.

  • Load the data into an Amazon RDS database and access it through Athena federated queries.

  • Convert the CSV files to columnar Parquet format and compress them with Snappy before storing them in S3.

  • Re-encode the files as line-delimited JSON and keep them GZIP-compressed in S3.
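
When weighing the options, it can help to estimate how many bytes a single query would scan under a row-oriented layout versus a columnar one, since Athena bills by data scanned. The sketch below is a rough back-of-the-envelope model, not a benchmark. The price per TB, the dataset size, the column counts, and both helper function names are assumptions made for illustration and do not come from the question. It also assumes all columns are the same size and ignores partitioning, predicate pushdown, and per-query minimums.

```python
# Rough cost model for comparing row-oriented and columnar layouts in Athena.
# All numbers below are illustrative assumptions, not values from the question.

PRICE_PER_TB_USD = 5.00   # assumed on-demand price per TB scanned; check your Region
BYTES_PER_TB = 1024 ** 4  # assumption: treat 1 TB as 2**40 bytes


def scan_cost_usd(bytes_scanned: float, price_per_tb: float = PRICE_PER_TB_USD) -> float:
    """Return the estimated cost of one query from the bytes it scans."""
    return bytes_scanned / BYTES_PER_TB * price_per_tb


def columnar_bytes_scanned(total_bytes: float, columns_read: int, total_columns: int) -> float:
    """Estimate bytes read when only the referenced columns are scanned.

    Simplification: every column is assumed to occupy an equal share of the data.
    """
    if not 0 < columns_read <= total_columns:
        raise ValueError("columns_read must be between 1 and total_columns")
    return total_bytes * columns_read / total_columns


# Hypothetical dataset: 2 TB on disk in either layout, 20 columns, query reads 3.
dataset_bytes = 2 * BYTES_PER_TB

# A row-oriented file (CSV or JSON) has to be read in full to extract any column.
row_layout_cost = scan_cost_usd(dataset_bytes)

# A columnar file lets the engine skip the columns the query does not reference.
columnar_cost = scan_cost_usd(columnar_bytes_scanned(dataset_bytes, 3, 20))

print(f"row-oriented scan: ${row_layout_cost:.2f}")
print(f"columnar scan:     ${columnar_cost:.2f}")
```

Under these assumptions the estimate depends only on the fraction of columns a query touches, so the same SQL scans less data when the storage layout allows columns to be skipped. Real savings also depend on compression ratio, file sizes, and partitioning, none of which this sketch models.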

Exam domain: Data Store Management