A manufacturing company streams millisecond-level telemetry (temperature, pressure, event_timestamp, machine_id) from thousands of machines into BigQuery. Analysts must calculate mean time between failures per plant and quarter, enrich results with machine specifications (model, install_date) and plant attributes (country, business_unit), and frequently filter on event_timestamp ranges. To balance query performance, storage cost, and the ability to update machine or plant attributes without rewriting historical events, how should you map these business requirements to a warehouse data model?
Store telemetry in a fact table partitioned on event_timestamp and clustered by machine_id; place machine specifications and plant attributes in separate dimension tables referenced through surrogate keys.
Use a single table where machine and plant metadata are stored as nested STRUCTs inside every telemetry event record.
Create a single denormalized table that repeats all machine and plant attributes in every telemetry row to eliminate joins.
Model machine specifications as the fact table and treat each telemetry event as a dimension record linked to it.
The telemetry feed is high-volume, time-series data that analysts aggregate, making it an ideal fact table. Partitioning the fact table by event_timestamp lets BigQuery prune partitions when queries filter by time, while clustering on machine_id accelerates point lookups and aggregations. Machine specifications and plant attributes are small, slowly changing reference datasets, so placing them in separate dimension tables (referenced by surrogate keys) avoids duplicating their values across billions of event rows, lowers storage costs, and lets you update or extend those attributes without touching historical telemetry. Duplicating or embedding the reference data inside every event row wastes storage and complicates updates, and inverting the model by making machines the fact table breaks the natural analytical grain needed for MTBF calculations.
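A minimal BigQuery DDL sketch of this model follows; the dataset, table, and column names (and the surrogate-key columns) are illustrative assumptions, not part of the question:

```sql
-- Fact table: one row per telemetry event. Partitioning by day enables
-- partition pruning on event_timestamp filters; clustering by machine_id
-- speeds per-machine lookups and aggregations.
CREATE TABLE telemetry.machine_events (
  event_timestamp TIMESTAMP NOT NULL,
  machine_id      STRING    NOT NULL,
  event_type      STRING,            -- hypothetical flag, e.g. 'failure'
  temperature     FLOAT64,
  pressure        FLOAT64
)
PARTITION BY TIMESTAMP_TRUNC(event_timestamp, DAY)
CLUSTER BY machine_id;

-- Dimension tables: small, slowly changing reference data keyed by
-- surrogate keys, updatable without rewriting historical events.
CREATE TABLE telemetry.dim_machine (
  machine_key  INT64  NOT NULL,      -- surrogate key
  machine_id   STRING NOT NULL,      -- natural/business key
  model        STRING,
  install_date DATE,
  plant_key    INT64                 -- surrogate key into dim_plant
);

CREATE TABLE telemetry.dim_plant (
  plant_key     INT64 NOT NULL,      -- surrogate key
  country       STRING,
  business_unit STRING
);
```

An analyst query against this model might count failure events per plant and quarter while still benefiting from partition pruning (again, `event_type` is an assumed column):

```sql
SELECT
  p.country,
  p.business_unit,
  TIMESTAMP_TRUNC(e.event_timestamp, QUARTER) AS quarter,
  COUNT(*) AS failure_count
FROM telemetry.machine_events AS e
JOIN telemetry.dim_machine AS m USING (machine_id)
JOIN telemetry.dim_plant   AS p USING (plant_key)
WHERE e.event_timestamp >= TIMESTAMP '2024-01-01'  -- enables partition pruning
  AND e.event_type = 'failure'
GROUP BY p.country, p.business_unit, quarter;
```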
GCP Professional Data Engineer
Storing the data