Your team stores 2 TB of clickstream data in a BigQuery table partitioned on the event_date column. Analysts frequently query the most recent 90 days for a single user_id, yet each query still scans entire daily partitions, driving up on-demand query costs. You need to reduce the bytes read without changing existing SQL, scheduling extra maintenance jobs, or incurring additional storage cost. Which approach best meets the requirement?
Add a clustering specification on user_id for the existing partitioned table.
Export the last 90 days of data to Cloud Storage and query it through external tables.
Create materialized daily tables per user_id and have analysts query only their specific table.
Re-partition the table using an integer-range partition on user_id instead of event_date.
Adding a clustering specification on user_id sorts the data inside every event_date partition so that rows with nearby user_id values are stored in the same data blocks. When a query filters on user_id, BigQuery can skip the blocks whose user_id range cannot match, which lowers the bytes scanned and therefore the on-demand cost. Existing SQL keeps working unchanged, clustering carries no extra storage charge, and BigQuery maintains the sort order automatically, so this option satisfies all three constraints.
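The block-pruning idea can be illustrated with a toy model. This is a minimal sketch, not BigQuery's actual storage format: the block size, row counts, and the helper names (`make_partition`, `to_blocks`, `rows_scanned`) are all assumptions made for illustration. It models one daily partition as a list of user_id values, splits it into fixed-size blocks with min/max metadata, and counts how many rows a single-user filter has to read with and without sorting.

```python
import random

BLOCK_SIZE = 1000  # rows per storage block (hypothetical figure)

def make_partition(num_rows, num_users, seed=0):
    """Simulate one daily partition as a list of user_id values in arrival order."""
    rng = random.Random(seed)
    return [rng.randrange(num_users) for _ in range(num_rows)]

def to_blocks(user_ids):
    """Split rows into fixed-size blocks and record each block's min/max user_id."""
    blocks = []
    for i in range(0, len(user_ids), BLOCK_SIZE):
        chunk = user_ids[i:i + BLOCK_SIZE]
        blocks.append((min(chunk), max(chunk), chunk))
    return blocks

def rows_scanned(blocks, target_user):
    """Count rows in blocks whose [min, max] range could contain target_user."""
    return sum(len(chunk) for lo, hi, chunk in blocks if lo <= target_user <= hi)

partition = make_partition(100_000, 5_000)
unclustered = to_blocks(partition)        # rows in arrival order
clustered = to_blocks(sorted(partition))  # rows sorted by user_id, as clustering would

target = 1234
print("unclustered rows scanned:", rows_scanned(unclustered, target))
print("clustered rows scanned:  ", rows_scanned(clustered, target))
```

In the unsorted layout almost every block spans nearly the full user_id range, so the filter cannot rule any of them out; after sorting, only the one or two blocks that contain the target user need to be read.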
Integer-range partitioning on user_id would require rebuilding the table and rewriting queries. Because a table can be partitioned on only one column, it would also give up pruning by event_date, and one partition per user could produce many thousands of partitions and exceed the per-table limit. Creating user-specific tables or exporting to Cloud Storage adds operational overhead and extra storage cost, and neither guarantees fewer scanned bytes for the existing queries.
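The partition-count concern comes down to simple arithmetic. The sketch below assumes a hypothetical population of 10 million user_id values; the helper name and the figures are illustrative, and the actual per-table partition limit should be checked against the current BigQuery quotas.

```python
import math

def integer_range_partition_count(start, end, interval):
    """Number of range partitions needed to cover [start, end) at the given interval."""
    return math.ceil((end - start) / interval)

# Hypothetical figures: 10 million distinct user_id values.
# One user per partition gives the best pruning but an enormous partition count.
per_user = integer_range_partition_count(0, 10_000_000, 1)

# Coarser buckets keep the count down, but each partition then holds many users,
# so a single-user query still reads every other user in the same bucket.
bucketed = integer_range_partition_count(0, 10_000_000, 5_000)

print(per_user, bucketed)
```

Either choice trades away something: fine-grained ranges run into the partition limit, while coarse ranges prune far less precisely than clustering within the existing date partitions.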
GCP Professional Data Engineer
Storing the data