AWS Certified Data Engineer Associate DEA-C01 Practice Question
A company ingests web clickstream logs into an S3 data lake. Analysts query the data with Amazon Athena. Queries typically target the most recent seven days but occasionally scan months of historical data. The data volume is about 5 TB per day. What is the most cost-effective way to organize the data in S3 to minimize Athena query runtimes and scan costs?
Store the logs as GZIP-compressed CSV files in a single prefix without partitions.
Convert the logs to ORC format but leave compression disabled to maximize read speed.
Convert the logs to columnar Parquet files, compress them, and partition the S3 prefix by event date (year/month/day).
Partition the logs by user ID and keep them as uncompressed JSON lines.
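Athena charges by the amount of data it scans, so the goal is to read as few bytes as possible per query. Converting the raw logs to a compressed columnar format such as Parquet reduces the bytes on disk, and columnar storage limits I/O to the columns a query actually references. Adding a day-based partition scheme (for example, year=YYYY/month=MM/day=DD/) enables partition pruning: a query that filters on the partition columns for the last seven days reads only those seven partitions, while an occasional historical query can still span months. The Parquet option with date partitions is therefore the correct choice. Unpartitioned CSV forces Athena to scan every file, and GZIP reduces the bytes read but still requires reading every column of every row. Partitioning by user ID does not match the date-based filter pattern, so date-filtered queries still touch every partition, and uncompressed JSON adds to the bytes scanned. ORC is also columnar, but disabling compression gives up a key cost-saving technique and leaves the data unpartitioned.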
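The effect of the layout described above can be sketched in a few lines of Python. This is an illustrative estimate only: the 5 TB per day figure comes from the question, while the bucket name, the price per TB, the Parquet size ratio, and the fraction of column data read are assumptions chosen for the example, not measured or official values.

```python
from datetime import date, timedelta

RAW_TB_PER_DAY = 5.0             # from the question
ASSUMED_PRICE_PER_TB = 5.0       # assumption: USD per TB scanned; check current Athena pricing
ASSUMED_PARQUET_RATIO = 0.2      # assumption: compressed Parquet size relative to raw logs
ASSUMED_COLUMN_FRACTION = 0.25   # assumption: share of column data a typical query reads


def partition_prefix(base: str, day: date) -> str:
    """Build the Hive-style prefix (year=/month=/day=) for one day of data."""
    return (
        f"{base.rstrip('/')}/year={day.year:04d}"
        f"/month={day.month:02d}/day={day.day:02d}/"
    )


def prefixes_for_range(base: str, end: date, days: int) -> list[str]:
    """List the prefixes a query filtered to the last `days` days would touch."""
    return [partition_prefix(base, end - timedelta(days=i)) for i in range(days)]


def estimated_scan_tb(days_read: int, columnar: bool) -> float:
    """Rough TB scanned when `days_read` days of data must be read."""
    tb = RAW_TB_PER_DAY * days_read
    if columnar:
        # Compressed Parquet is smaller, and only the queried columns are read.
        tb *= ASSUMED_PARQUET_RATIO * ASSUMED_COLUMN_FRACTION
    return tb


if __name__ == "__main__":
    base = "s3://example-bucket/clickstream"  # hypothetical bucket and prefix
    for prefix in prefixes_for_range(base, date(2024, 3, 9), 7):
        print(prefix)

    # Seven-day query: pruned Parquet partitions vs. 90 days of unpartitioned raw logs.
    pruned = estimated_scan_tb(7, columnar=True)
    unpruned = estimated_scan_tb(90, columnar=False)
    print(f"pruned Parquet: {pruned:.2f} TB, ~${pruned * ASSUMED_PRICE_PER_TB:.2f}")
    print(f"unpartitioned raw: {unpruned:.2f} TB, ~${unpruned * ASSUMED_PRICE_PER_TB:.2f}")
```

The 90-day retention used in the comparison is also an arbitrary figure. The point is the shape of the result: without partitions the scan grows with everything stored under the prefix, whereas with date partitions it grows only with the number of days the query filters on.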