AWS Certified Data Engineer Associate DEA-C01 Practice Question
An analytics company ingests 2 TB of clickstream data per day into an Amazon Redshift RA3 cluster. The fact table click_events is queried mainly in these two ways:
Analysts run dashboards that filter on event_time for the last 7 days.
Ad-hoc queries join click_events to a 15-GB dimension table users on user_id.
The users table already uses DISTKEY user_id and has no sort key. The engineering team must create the click_events table to deliver the most predictable query performance while keeping storage costs low. Which table definition meets these requirements?
Create click_events with DISTSTYLE AUTO and INTERLEAVED SORTKEY(user_id, event_time).
Create click_events with DISTKEY event_time and COMPOUND SORTKEY(user_id, event_time).
Create click_events with DISTSTYLE ALL and no sort key.
Create click_events with DISTKEY user_id and COMPOUND SORTKEY(event_time, user_id).
The workload needs data locality for the frequent join with users and efficient block pruning for the typical time-range filter. Making user_id the DISTKEY on click_events co-locates matching fact and dimension rows on the same slices, because users is already distributed on user_id, so the join runs without redistributing rows across the network.
A compound sort key with event_time as the leading column, SORTKEY(event_time, user_id), stores rows in time order, so the blocks for recent time ranges are contiguous. Redshift can use the minimum and maximum values it tracks for each block (zone maps) to skip blocks older than the 7-day window, which minimizes I/O. Keeping user_id as the second column orders rows that share an event_time by user_id without the extra maintenance an interleaved sort key requires. An interleaved key is also a poor fit for a steadily increasing column such as event_time.
DISTSTYLE ALL would replicate the multi-terabyte fact table to every node, dramatically increasing storage cost. DISTSTYLE AUTO leaves the distribution choice to Redshift and DISTSTYLE EVEN spreads rows round-robin, so neither guarantees the co-located join. Choosing event_time as the DISTKEY would not co-locate rows with users, so the join would have to redistribute data, and it may distribute data unevenly.
Therefore, the definition with DISTKEY user_id and COMPOUND SORTKEY(event_time, user_id) best satisfies both the performance and the cost requirements.
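The reasoning above can be illustrated with a small Python sketch. It is a toy model, not Redshift itself: the column types in the DDL string, the slice count, the rows-per-block figure, and the CRC32 hash are all assumptions made for illustration (Redshift's internal distribution hash is not public, and real blocks are 1 MB, not a fixed row count). It shows two things: rows hashed on the same key land on the same slice, and sorting by time lets a min/max check skip most blocks for a "last 7 days" filter.

```python
import random
import zlib

# Hypothetical DDL for the chosen option; column types are assumptions.
DDL = """
CREATE TABLE click_events (
    event_time TIMESTAMP NOT NULL,
    user_id    BIGINT    NOT NULL
    -- remaining clickstream columns omitted
)
DISTSTYLE KEY
DISTKEY (user_id)
COMPOUND SORTKEY (event_time, user_id);
"""

NUM_SLICES = 4        # assumed slice count, for illustration only
ROWS_PER_BLOCK = 100  # assumed block size in rows (toy value)


def slice_for(user_id: int) -> int:
    """Toy stand-in for the DISTKEY hash: same key, same slice."""
    return zlib.crc32(str(user_id).encode()) % NUM_SLICES


def blocks_scanned(event_days, first_day, last_day):
    """Count blocks whose [min, max] range overlaps the filter range."""
    scanned = 0
    for start in range(0, len(event_days), ROWS_PER_BLOCK):
        block = event_days[start:start + ROWS_PER_BLOCK]
        if min(block) <= last_day and max(block) >= first_day:
            scanned += 1
    return scanned


rng = random.Random(42)
# 90 days of synthetic events as (day number, user_id) pairs.
events = [(rng.randrange(90), rng.randrange(1000)) for _ in range(9000)]

# 1) Both tables distributed on user_id: matching rows share a slice.
user_slice = {u: slice_for(u) for u in range(1000)}
colocated = all(slice_for(uid) == user_slice[uid] for _, uid in events)

# 2) Block pruning for the last 7 days (days 83..89), unsorted vs sorted.
unsorted_days = [day for day, _ in events]
sorted_days = sorted(unsorted_days)
total_blocks = len(events) // ROWS_PER_BLOCK
unsorted_scan = blocks_scanned(unsorted_days, 83, 89)
sorted_scan = blocks_scanned(sorted_days, 83, 89)

print("co-located join:", colocated)
print("blocks scanned, unsorted:", unsorted_scan, "of", total_blocks)
print("blocks scanned, sorted:  ", sorted_scan, "of", total_blocks)
```

In the unsorted layout nearly every block contains at least one recent row, so the min/max check can rule out very few of them. In the time-sorted layout only the trailing blocks overlap the filter range.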