You are designing a BigQuery warehouse for an online-learning platform. Each test submission arrives as a JSON object that contains metadata about the submission (submission_id, student_id, timestamp, total_score) and an array of 40-60 questionResponse objects (question_id, is_correct, score). Analysts frequently need daily reports showing the average test score per student and occasionally need to drill into individual question responses for troubleshooting. You must minimize the data scanned and avoid joins between tables during typical queries. Which table design best meets these requirements?
Create one table with a row per submission and an ARRAY<STRUCT<question_id INT64, is_correct BOOL, score FLOAT64>> column to store all question responses for that submission.
Create a wide, flattened table with one row per question response. Duplicate submission metadata columns across every row.
Create two tables: a submission fact table and a questionResponses dimension table keyed by submission_id, and join them at query time.
Store the raw JSON files in Cloud Storage and query them as an external table to avoid schema design changes.
A single table with one row per submission and a repeated STRUCT column holding all questionResponse records keeps the parent-level attributes and their children in the same row. Because BigQuery stores nested and repeated fields in a columnar layout, a query that averages total_score reads only the parent columns it references and never touches the question-level data, so it scans far less than it would against a flattened table carrying 40-60 rows per submission. When analysts need individual questions, they can UNNEST the array. This design eliminates the submission-to-question join that a normalized design would require and avoids the duplicated submission metadata that a fully flattened table would repeat on every row. Querying raw JSON in Cloud Storage as an external table would also avoid joins, but BigQuery would have to read each file in full on every query, and the data would forgo native-table optimizations such as column pruning and clustering.
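The design described above can be sketched in BigQuery SQL roughly as follows. This is an illustration under stated assumptions, not a definitive schema: the dataset and table name (learning.submissions), the column names submitted_at and question_responses, the STRING type for the two ID columns, the literal date and submission ID, and the choice to partition by day and cluster by student_id are all hypothetical. Only the fields and the ARRAY<STRUCT<...>> type come from the question itself.

```sql
-- Hypothetical table: one row per submission, question responses nested.
CREATE TABLE learning.submissions (
  submission_id STRING,      -- type assumed; the question does not state it
  student_id    STRING,      -- type assumed
  submitted_at  TIMESTAMP,   -- the question's "timestamp" field, renamed here
  total_score   FLOAT64,
  question_responses ARRAY<STRUCT<question_id INT64, is_correct BOOL, score FLOAT64>>
)
PARTITION BY DATE(submitted_at)   -- assumed: lets daily reports prune partitions
CLUSTER BY student_id;            -- assumed: groups each student's rows together

-- Typical daily report: references only parent columns, so the nested
-- question_responses column is not read at all.
SELECT
  DATE(submitted_at) AS submission_date,
  student_id,
  AVG(total_score)   AS avg_total_score
FROM learning.submissions
WHERE DATE(submitted_at) = DATE '2024-01-15'
GROUP BY submission_date, student_id;

-- Occasional drill-down: UNNEST expands the array into one row per question
-- for the selected submission. No second table is involved.
SELECT
  s.submission_id,
  qr.question_id,
  qr.is_correct,
  qr.score
FROM learning.submissions AS s
CROSS JOIN UNNEST(s.question_responses) AS qr
WHERE DATE(s.submitted_at) = DATE '2024-01-15'
  AND s.submission_id = 'sub-123';
```

The CROSS JOIN UNNEST in the drill-down query is a correlated expansion of each row's own array rather than a join against another table, which is why this design still satisfies the no-joins requirement.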
GCP Professional Data Engineer
Storing the data