AWS Certified Data Engineer Associate DEA-C01 Practice Question
A company loads 2 TB of time-series sensor events into an Amazon Redshift table every day by appending new rows. Business intelligence dashboards filter on event_date for the most recent 7 days and aggregate results by device_id, joining to a small device metadata table. The current heap table causes long scan times. Which schema change will most effectively reduce dashboard latency without adding load-time complexity?
Create an external Redshift Spectrum table partitioned by event_date and load new partitions daily.
Convert the table to a compound sort key of event_date, device_id and use device_id as the DISTKEY.
Leave the heap layout unchanged but schedule VACUUM and ANALYZE to run after each daily load.
Change the table to EVEN distribution and define an INTERLEAVED sort key on device_id, event_date.
Dashboards always apply a range predicate on event_date, so physically co-locating recent dates is the primary performance need. A compound sort key with event_date as the leading column matches the order in which the daily appends arrive, so new rows land largely in sorted order and Redshift can use block-level min/max metadata (zone maps) to skip blocks that fall outside the 7-day window. It also avoids the extra maintenance that interleaved keys require. Setting device_id as the DISTKEY places all rows for a given device on the same slice, which keeps the aggregation by device_id local and lets the join to the small metadata table run without redistributing the large table, provided the metadata table is distributed on device_id or with DISTSTYLE ALL.

The other options fall short. Interleaved sort keys weight all key columns equally, but they incur recurring reindexing costs (VACUUM REINDEX) as data is appended and provide less benefit for time-based range filtering. EVEN distribution with an interleaved key on device_id, event_date also gives up co-location by device_id and still scans more blocks for the 7-day window. Redshift does not support native table partitioning inside the cluster; the Redshift Spectrum option moves the data to an external table and requires registering a new partition every day, which is the load-time complexity the question rules out. Running VACUUM and ANALYZE after each load does not help either, because the table still has no sort key aligned with the query predicates.
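As a rough illustration of the explanation above, the Python sketch below holds a possible DDL statement for the chosen option and a toy model of block skipping. The table name sensor_events, the reading column, the data types, and the row and block sizes are all assumptions made for the example; real Redshift blocks are 1 MB per column per slice, so the absolute numbers carry no meaning, only the ratio does.

```python
from datetime import date, timedelta

# Hypothetical DDL for the chosen option. The table name, the reading column,
# and the data types are assumptions; only event_date and device_id come from
# the question.
SENSOR_EVENTS_DDL = """
CREATE TABLE sensor_events (
    event_date DATE   NOT NULL,
    device_id  BIGINT NOT NULL,
    reading    DOUBLE PRECISION
)
DISTSTYLE KEY
DISTKEY (device_id)
COMPOUND SORTKEY (event_date, device_id);
"""


def build_blocks(days, rows_per_day, rows_per_block):
    """Toy model: append rows in date order, then record (min, max) date per block."""
    start = date(2024, 1, 1)
    rows = [start + timedelta(days=d) for d in range(days) for _ in range(rows_per_day)]
    chunks = (rows[i:i + rows_per_block] for i in range(0, len(rows), rows_per_block))
    # Rows are already sorted, so the first and last entries are the min and max.
    return [(chunk[0], chunk[-1]) for chunk in chunks]


def blocks_to_scan(blocks, lo, hi):
    """Count blocks whose min/max range overlaps lo <= event_date <= hi."""
    return sum(1 for bmin, bmax in blocks if bmax >= lo and bmin <= hi)


blocks = build_blocks(days=90, rows_per_day=1000, rows_per_block=500)
last_day = date(2024, 1, 1) + timedelta(days=89)
scanned = blocks_to_scan(blocks, last_day - timedelta(days=6), last_day)
print(f"{scanned} of {len(blocks)} blocks scanned")  # prints "14 of 180 blocks scanned"
```

In this toy model the 7-day predicate touches only the blocks holding the last seven days of data. Without a sort key led by event_date, the per-block date ranges would overlap and far fewer blocks could be skipped.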
AWS Certified Data Engineer Associate DEA-C01
Data Store Management