You are designing BigQuery tables for an OTT video-analytics platform. A central fact table will capture billions of viewing-session events. User demographics and device characteristics change frequently, and the same dimensions are shared by several other fact tables. The business wants to prevent data duplication, keep dimension updates lightweight, and enforce a single source of truth, and it accepts the overhead of an extra join at query time. Which schema design best meets these requirements in BigQuery?
Create a snowflake schema in which the fact table keeps only surrogate keys and user and device details reside in separate, normalized dimension tables.
Store all user and device attributes directly in the fact table using nested and repeated fields to avoid joins.
Snapshot user and device data daily and append it, denormalized, to the fact table so that each event row contains the latest dimension values.
Shard the fact table by month and keep full user and device attributes in each shard to simplify partition pruning.
Normalizing frequently updated, shared dimensions into their own tables and storing only surrogate keys in the fact table forms a snowflake schema. Because each dimension attribute is stored once, an update touches a small dimension table instead of rewriting the massive fact tables, which reduces storage and ETL costs and gives every fact table the same source of truth. BigQuery does not enforce primary or foreign key constraints, so the load pipeline has to keep the keys consistent. Denormalizing everything into a wide or nested fact table duplicates the attributes and forces large-scale rewrites when they change. Appending daily denormalized snapshots, or keeping full attributes in every monthly shard, duplicates the same information and inflates maintenance effort. Therefore, a snowflake schema with separate dimension tables is the most appropriate choice.
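The update behavior described above can be sketched in a few lines. This is only an illustration under stated assumptions: SQLite (from the Python standard library) stands in for BigQuery so that the example runs locally, and every table and column name (`dim_user`, `dim_device`, `fact_viewing_session`, and so on) is hypothetical rather than taken from the question.

```python
import sqlite3

# Assumption: SQLite is used here only as a local stand-in for BigQuery.
# All table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_user (
    user_key INTEGER PRIMARY KEY, age_band TEXT, country TEXT
);
CREATE TABLE dim_device (
    device_key INTEGER PRIMARY KEY, model TEXT, os_version TEXT
);
-- The fact table carries surrogate keys and measures only.
CREATE TABLE fact_viewing_session (
    session_id INTEGER PRIMARY KEY,
    user_key INTEGER,
    device_key INTEGER,
    watch_seconds INTEGER
);
""")

conn.executemany(
    "INSERT INTO dim_user VALUES (?, ?, ?)",
    [(1, "25-34", "DE"), (2, "35-44", "US")],
)
conn.executemany(
    "INSERT INTO dim_device VALUES (?, ?, ?)",
    [(10, "TV-A", "1.0"), (11, "Phone-B", "7.2")],
)
conn.executemany(
    "INSERT INTO fact_viewing_session VALUES (?, ?, ?, ?)",
    [(100, 1, 10, 1200), (101, 1, 11, 300), (102, 2, 10, 900)],
)

# A device attribute changes: exactly one dimension row is updated,
# and no fact row is rewritten.
cur = conn.execute(
    "UPDATE dim_device SET os_version = '1.1' WHERE device_key = 10"
)
updated_rows = cur.rowcount

# The accepted cost: a join at query time to read the current attribute.
report = conn.execute("""
    SELECT d.os_version, SUM(f.watch_seconds)
    FROM fact_viewing_session AS f
    JOIN dim_device AS d ON d.device_key = f.device_key
    GROUP BY d.os_version
    ORDER BY d.os_version
""").fetchall()

print(updated_rows, report)
```

In a denormalized layout the same attribute change would have to be applied to every fact row that references the device, which is the large-scale rewrite the explanation warns about.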
GCP Professional Data Engineer
Storing the data