AWS Certified Data Engineer Associate DEA-C01 Practice Question
An energy company ingests about 10,000 JSON readings per second from smart meters (≤1 KB each). The operations team needs the last 7 days of data available for dashboard queries that retrieve a meter's readings by time range in single-digit milliseconds. They want minimal administration, automatic scaling, and automatic expiry. Which storage solution is MOST cost-effective?
Store the data in an Amazon RDS for MySQL instance with provisioned IOPS and schedule nightly DELETE statements to purge rows older than 7 days.
Use the Amazon Redshift streaming ingestion API to load data into a cluster and create materialized views for the dashboard queries.
Stream the records into Amazon Kinesis Data Streams, deliver them to Amazon S3, and run dashboard queries with Amazon Athena.
Create an Amazon DynamoDB table in on-demand mode with meter_id as the partition key, timestamp as the sort key, and enable TTL to delete items after 7 days.
Amazon DynamoDB is the best solution because it meets all requirements. Key-based queries return in single-digit milliseconds, and on-demand capacity mode scales with the high-velocity ingestion without capacity planning or manual intervention. DynamoDB Time to Live (TTL) deletes each item after the Unix epoch time (in seconds) stored in a designated attribute has passed, which satisfies the 7-day retention and automatic expiry requirement. TTL deletes run in the background and consume no write throughput, so expiry adds no extra cost. Expired items are typically removed within a few days of their expiry time rather than at the exact second.
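As a rough illustration of the on-demand and TTL settings described above, the following Python sketch builds the request parameters for such a table and stamps each reading with an expiry time. It is a minimal sketch under stated assumptions, not a reference implementation: the table name `MeterReadings`, the TTL attribute name `expires_at`, the `watts` field, and the choice to store `timestamp` as epoch seconds are all hypothetical and do not come from the question.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)
TTL_ATTRIBUTE = "expires_at"  # hypothetical name; TTL needs a Number attribute in epoch seconds


def table_definition(table_name: str) -> dict:
    """Keyword arguments for the DynamoDB CreateTable operation."""
    return {
        "TableName": table_name,
        "BillingMode": "PAY_PER_REQUEST",  # on-demand capacity mode
        "AttributeDefinitions": [
            {"AttributeName": "meter_id", "AttributeType": "S"},
            {"AttributeName": "timestamp", "AttributeType": "N"},
        ],
        "KeySchema": [
            {"AttributeName": "meter_id", "KeyType": "HASH"},    # partition key
            {"AttributeName": "timestamp", "KeyType": "RANGE"},  # sort key
        ],
    }


def ttl_settings(table_name: str) -> dict:
    """Keyword arguments for the DynamoDB UpdateTimeToLive operation."""
    return {
        "TableName": table_name,
        "TimeToLiveSpecification": {"Enabled": True, "AttributeName": TTL_ATTRIBUTE},
    }


def build_item(meter_id: str, read_at: datetime, reading: dict) -> dict:
    """Attach the key attributes and a 7-day expiry time to one reading."""
    epoch = int(read_at.timestamp())
    return {
        **reading,
        "meter_id": meter_id,
        "timestamp": epoch,
        TTL_ATTRIBUTE: epoch + int(RETENTION.total_seconds()),
    }


item = build_item(
    "meter-001",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    {"watts": 1250},  # hypothetical payload field
)
```

With boto3, these dictionaries could be passed as `client.create_table(**table_definition("MeterReadings"))` and `client.update_time_to_live(**ttl_settings("MeterReadings"))`, and each item written with the table resource's `put_item(Item=item)`. Storing the expiry on the item itself keeps retention a per-record property, so no purge job is needed.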
Storing the data in an Amazon RDS for MySQL instance carries higher operational overhead. The team would have to size the instance, maintain indexes, and schedule jobs to delete old data. The instance does not scale automatically with the write load, and provisioned IOPS make it more expensive for this workload.
Using Amazon S3 with Amazon Athena is unsuitable because Athena's query latency is typically measured in seconds, failing the single-digit millisecond performance requirement. It is also not cost-effective for a high volume of small, frequent queries.
Amazon Redshift is a data warehouse built for complex analytical queries (OLAP), not the low-latency key-value lookups needed here. Even with streaming ingestion, it cannot reliably meet the single-digit millisecond latency requirement and would be more expensive to operate for this access pattern.
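The proposed DynamoDB table design, with meter_id as the partition key and timestamp as the sort key, matches the access pattern directly. All of a meter's readings share one partition key and are stored in sort-key order, so a single Query operation with a range condition on timestamp retrieves a meter's readings for a specific time range without scanning other meters' data.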
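The time-range lookup on this key design can be sketched as the parameters of a DynamoDB Query request. This is a minimal sketch that assumes `timestamp` is stored as a Number holding epoch seconds and that the table is named `MeterReadings`; neither detail is given in the question. Because `timestamp` is a DynamoDB reserved word, the expression refers to it through the placeholder `#ts`.

```python
from datetime import datetime, timezone


def time_range_query(table_name: str, meter_id: str,
                     start: datetime, end: datetime) -> dict:
    """Keyword arguments for a low-level DynamoDB Query on one meter's readings."""
    return {
        "TableName": table_name,
        # Equality on the partition key, inclusive range on the sort key.
        "KeyConditionExpression": "meter_id = :meter AND #ts BETWEEN :start AND :end",
        # Placeholder needed because "timestamp" is a reserved word.
        "ExpressionAttributeNames": {"#ts": "timestamp"},
        "ExpressionAttributeValues": {
            ":meter": {"S": meter_id},
            ":start": {"N": str(int(start.timestamp()))},
            ":end": {"N": str(int(end.timestamp()))},
        },
    }


query = time_range_query(
    "MeterReadings",
    "meter-001",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)
```

With boto3, the request could be issued as `client.query(**query)`. Results come back in sort-key order, so the dashboard receives the readings already ordered by time.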
Data Store Management