Your team is building a BigQuery data warehouse for an e-commerce site that records 80 million clickstream events every day. Analysts frequently join these events with the product catalog to calculate conversion rates by brand and style. The product catalog contains about 50 000 rows and its descriptive attributes (price, color, size availability) are updated several times a week. The business wants:
Stable query performance for exploratory SQL
Minimal operational work when catalog attributes change
Which table design best meets these requirements?
Fully flatten the product attributes into the clickstream fact table on load; reload the entire fact table each time the catalog is updated to keep data in sync.
Embed the full set of product attributes as nested, repeated fields inside every clickstream event and update affected event rows whenever the catalog changes.
Store clickstream events in a partitioned, denormalized fact table and maintain the product catalog as a separate dimension table that analysts join at query time.
Normalize the schema into multiple dimension tables (product, brand, color) and create a materialized view that joins them to the clickstream fact table on a schedule.
The best choice is the partitioned fact table with the product catalog kept as a separate dimension table. Keeping the massive, append-only clickstream data in a partitioned fact table while storing the much smaller, frequently updated product catalog in its own dimension table minimizes operational effort: a catalog change touches only a 50 000-row table instead of rewriting billions of fact rows. Because the dimension table is so small, BigQuery can broadcast it to the workers that scan the fact table during a join, so ad-hoc queries still perform well. Fully denormalizing the catalog into the clickstream (either as flat columns or as nested records) would force costly rewrites whenever attributes change, while a more complex snowflake schema plus a materialized view adds maintenance overhead without a clear benefit.
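The reasoning above can be illustrated with a small local sketch. It uses Python's built-in sqlite3 module as a stand-in for BigQuery, so partitioning and broadcast joins are not modeled; the table names, column names, and sample rows are all hypothetical. It only shows that a catalog change updates one dimension row, leaves every fact row untouched, and is still reflected by the join at query time.

```python
import sqlite3

# SQLite is only a local stand-in so the sketch runs anywhere; it is not BigQuery.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE clickstream_events (   -- hypothetical append-only fact table
        event_date TEXT,
        product_id INTEGER,
        event_type TEXT                 -- 'view' or 'purchase'
    );
    CREATE TABLE product_catalog (      -- hypothetical small dimension table
        product_id INTEGER PRIMARY KEY,
        brand TEXT,
        color TEXT
    );
""")
conn.executemany(
    "INSERT INTO product_catalog VALUES (?, ?, ?)",
    [(1, "Acme", "red"), (2, "Zenith", "red")],
)
conn.executemany(
    "INSERT INTO clickstream_events VALUES (?, ?, ?)",
    [
        ("2024-01-01", 1, "view"), ("2024-01-01", 1, "view"),
        ("2024-01-02", 1, "view"), ("2024-01-02", 1, "view"),
        ("2024-01-02", 1, "purchase"),
        ("2024-01-01", 2, "view"), ("2024-01-02", 2, "view"),
        ("2024-01-02", 2, "purchase"),
    ],
)

def conversion_by(attribute):
    """Return {attribute value: purchases / views}, joining the dimension at query time."""
    if attribute not in {"brand", "color"}:  # guard the interpolated column name
        raise ValueError(f"unsupported attribute: {attribute}")
    rows = conn.execute(f"""
        SELECT p.{attribute},
               1.0 * SUM(CASE WHEN e.event_type = 'purchase' THEN 1 ELSE 0 END)
                   / SUM(CASE WHEN e.event_type = 'view' THEN 1 ELSE 0 END)
        FROM clickstream_events AS e
        JOIN product_catalog AS p ON p.product_id = e.product_id
        GROUP BY p.{attribute}
    """).fetchall()
    return dict(rows)

def fact_row_count():
    return conn.execute("SELECT COUNT(*) FROM clickstream_events").fetchone()[0]

before = conversion_by("color")
facts_before = fact_row_count()

# A catalog change: only the dimension row is updated, no fact rows are rewritten.
updated_rows = conn.execute(
    "UPDATE product_catalog SET color = 'blue' WHERE product_id = 2"
).rowcount

after = conversion_by("color")
facts_after = fact_row_count()

print("dimension rows updated:", updated_rows)
print("fact rows before/after:", facts_before, facts_after)
print("conversion by color before:", before)
print("conversion by color after:", after)
```

In the flattened or nested designs, the same color change would instead require rewriting every event row that references product 2.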
GCP Professional Data Engineer
Storing the data