AWS Certified Data Engineer Associate DEA-C01 Practice Question
An e-commerce company's CloudTrail logs from multiple accounts are centralized in Amazon S3 at s3://audit-logs/AWSLogs/AccountID/CloudTrail/us-east-1/YYYY/MM/DD/. A Glue table named cloudtrail_logs is queried in Athena for the last 7 days, but each query still scans several terabytes because new partitions are only added by a nightly Glue crawler. Without moving or transforming the data, which action most effectively reduces query cost and latency?
Create an Amazon Redshift external schema and copy the CloudTrail data into a Redshift table with sort keys on eventTime for faster SQL queries.
Run an AWS Glue ETL job that converts the CloudTrail JSON files to compressed Parquet under a new S3 prefix and update analysts to query the new table.
Enable partition projection on the Glue table and define year, month, and day ranges so Athena automatically discovers new partitions at query time.
Stream CloudTrail events to CloudWatch Logs and instruct analysts to run their compliance queries with CloudWatch Logs Insights instead of Athena.
The correct answer is to enable partition projection on the Glue table, which removes the need to load new partitions with a crawler. Athena computes partition locations from the year, month, and day ranges defined in the table properties instead of looking them up in the Glue Data Catalog, so new dates are queryable as soon as their objects arrive in S3. Queries that filter on the partition columns read only the matching date prefixes, which sharply cuts the amount of data scanned and therefore the per-query cost and runtime. Converting the logs to Parquet would also lower scan size, but it transforms the data and adds an ETL pipeline, which the requirement forbids. Copying data into Redshift or querying CloudWatch Logs Insights changes the analytics service and adds ingestion or storage costs without addressing the partition-discovery bottleneck in Athena.
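As a rough illustration of what "defining ranges in table properties" can look like, the sketch below builds an Athena `ALTER TABLE ... SET TBLPROPERTIES` statement in Python. It rests on several assumptions that are not stated in the question: the table's partition columns are named `account_id`, `year`, `month`, and `day`; the account IDs and year range are placeholder values; and the helper function names are hypothetical. The property keys (`projection.enabled`, `projection.<column>.type`, `projection.<column>.range`, `projection.<column>.digits`, `projection.<column>.values`, `storage.location.template`) are the ones Athena uses for partition projection.

```python
def projection_properties(account_ids, first_year, last_year):
    """Build partition projection table properties (hypothetical helper).

    Assumes partition columns named account_id, year, month, and day.
    """
    if not account_ids:
        raise ValueError("at least one account ID is required")
    return {
        "projection.enabled": "true",
        # Enumerate the accounts whose logs land in the bucket.
        "projection.account_id.type": "enum",
        "projection.account_id.values": ",".join(account_ids),
        # Integer ranges; 'digits' zero-pads values to match MM and DD.
        "projection.year.type": "integer",
        "projection.year.range": f"{first_year},{last_year}",
        "projection.month.type": "integer",
        "projection.month.range": "1,12",
        "projection.month.digits": "2",
        "projection.day.type": "integer",
        "projection.day.range": "1,31",
        "projection.day.digits": "2",
        # Maps partition values onto the existing S3 layout from the question.
        "storage.location.template": (
            "s3://audit-logs/AWSLogs/${account_id}/CloudTrail/us-east-1/"
            "${year}/${month}/${day}/"
        ),
    }


def alter_table_sql(table, props):
    """Render the properties as an Athena DDL statement."""
    pairs = ",\n  ".join(f"'{key}' = '{value}'" for key, value in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES (\n  {pairs}\n)"


if __name__ == "__main__":
    # Placeholder account IDs and year range, for illustration only.
    props = projection_properties(["111111111111", "222222222222"], 2023, 2030)
    print(alter_table_sql("cloudtrail_logs", props))
```

Running the generated statement in Athena would leave the data in place; only the table metadata changes. Queries still need to filter on the partition columns for Athena to skip the other prefixes.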
Exam domain: Data Security and Governance