AWS Certified Data Engineer Associate DEA-C01 Practice Question
An ecommerce company keeps 3 years of web-server logs as uncompressed .txt files in the s3://company-data/logs/ prefix. Data analysts must run interactive ad-hoc SQL queries against only the most recent 90 days of logs. The solution must minimize query cost, leave the raw files unchanged, and avoid managing long-running infrastructure. Which approach best meets these requirements?
Copy the most recent 90 days of logs into an Amazon Redshift cluster and pause the cluster when queries are finished.
Create external tables in Amazon Athena that reference the existing .txt files and add day-based partitions for the last 90 days.
Import all .txt logs into an Amazon RDS for PostgreSQL instance with auto-scaling storage and index the timestamp column.
Use an AWS Glue ETL job to convert the latest 90 days of .txt logs to compressed Parquet files in a separate S3 prefix and query that prefix with Amazon Athena.
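Correct answer: use an AWS Glue ETL job to convert the latest 90 days of .txt logs to compressed Parquet files in a separate S3 prefix, and query that prefix with Amazon Athena. Athena bills by the amount of data each query scans, so converting the most recent 90-day slice to a columnar, compressed format such as Parquet sharply reduces both per-query cost and latency: a query reads only the columns it references, and those columns are compressed. A serverless AWS Glue job writes the converted data to a new S3 prefix, so the original uncompressed text files remain untouched, and Athena queries only the Parquet objects without requiring an always-on cluster. Pointing Athena directly at the .txt files restricts scans to the selected partitions but still reads every byte of each uncompressed row, so scan costs stay high. Amazon Redshift and Amazon RDS both require provisioning and managing database infrastructure, which increases operational overhead and cost; the RDS option also imports all 3 years of logs rather than only the 90 days the analysts need.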
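The cost argument can be made concrete with a rough back-of-the-envelope calculation. The Python sketch below is only an illustration: the 900 GiB data size, the 5 USD per TB price, the 4x compression ratio, and the 25% column fraction are all hypothetical assumptions rather than figures from the question, and `estimate_scan` is a made-up helper, not an AWS API.

```python
from dataclasses import dataclass

TB = 1024 ** 4  # bytes per terabyte (binary), used for the per-TB price


@dataclass
class ScanEstimate:
    bytes_scanned: float
    cost_usd: float


def estimate_scan(raw_bytes, price_per_tb, compression_ratio=1.0, column_fraction=1.0):
    """Estimate the bytes scanned and cost of one full pass over the data.

    compression_ratio: raw size divided by stored size (1.0 = uncompressed).
    column_fraction:   share of the stored bytes a query reads
                       (1.0 for row-oriented text, lower for columnar formats).
    """
    scanned = raw_bytes / compression_ratio * column_fraction
    return ScanEstimate(scanned, scanned / TB * price_per_tb)


# All values below are hypothetical assumptions for illustration only.
RAW_90_DAYS = 900 * 1024 ** 3  # assume 900 GiB of raw text for 90 days
PRICE_PER_TB = 5.0             # assumed price per TB scanned; check current pricing

text = estimate_scan(RAW_90_DAYS, PRICE_PER_TB)
parquet = estimate_scan(
    RAW_90_DAYS,
    PRICE_PER_TB,
    compression_ratio=4.0,  # assumed Parquet compression versus raw text
    column_fraction=0.25,   # assume a query touches a quarter of the columns
)

print(f"text scan:    ${text.cost_usd:.2f} per full query")
print(f"parquet scan: ${parquet.cost_usd:.2f} per full query")
print(f"reduction:    {text.cost_usd / parquet.cost_usd:.0f}x")
```

Under these assumptions the two effects multiply: compression shrinks the stored bytes, and the columnar layout lets a query skip the columns it does not reference. Real ratios depend on the log schema and on the queries the analysts actually run.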
Exam domain: Data Store Management