GCP Professional Data Engineer Practice Question

You are designing BigQuery tables for an OTT video-analytics platform. A central fact table will capture billions of viewing-session events. User demographics and device characteristics change frequently, and the same dimensions are shared by several other fact tables. The business wants to prevent data duplication, keep dimension updates lightweight, and enforce a single source of truth, and it accepts the overhead of an extra join at query time. Which schema design best meets these requirements in BigQuery?

  • Snapshot user and device data daily and append it, denormalized, to the fact table so that each event row contains the latest dimension values.

  • Shard the fact table by month and keep full user and device attributes in each shard to simplify partition pruning.

  • Create a snowflake schema in which the fact table keeps only surrogate keys, while user and device details reside in separate, normalized dimension tables.

  • Store all user and device attributes directly in the fact table using nested and repeated fields to avoid joins.
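
For context, here is a minimal sketch of the normalized, snowflake-style layout described in one of the options above, expressed in BigQuery DDL. The dataset name (analytics), the table names, and all column names are hypothetical and chosen only to illustrate how a fact table can hold surrogate keys while shared user and device attributes live once in dimension tables.

    -- Hypothetical dataset and table names, for illustration only.
    -- Dimension tables: the single source of truth for user and device attributes.
    CREATE TABLE analytics.dim_user (
      user_sk      INT64 NOT NULL,   -- surrogate key referenced by fact tables
      user_id      STRING,
      age_band     STRING,
      country      STRING,
      subscription STRING
    );

    CREATE TABLE analytics.dim_device (
      device_sk    INT64 NOT NULL,   -- surrogate key referenced by fact tables
      device_model STRING,
      os_version   STRING,
      screen_size  STRING
    );

    -- Fact table: keeps only surrogate keys plus event measures,
    -- partitioned on event time and clustered on the join keys.
    CREATE TABLE analytics.fact_viewing_session (
      event_ts      TIMESTAMP NOT NULL,
      user_sk       INT64,
      device_sk     INT64,
      content_id    STRING,
      watch_seconds INT64,
      bitrate_kbps  INT64
    )
    PARTITION BY DATE(event_ts)
    CLUSTER BY user_sk, device_sk;

The accepted trade-off is the extra join per dimension at query time, for example:

    -- Join the fact table back to the shared dimensions at query time.
    SELECT
      u.country,
      d.device_model,
      SUM(f.watch_seconds) AS total_watch_seconds
    FROM analytics.fact_viewing_session AS f
    JOIN analytics.dim_user   AS u ON u.user_sk   = f.user_sk
    JOIN analytics.dim_device AS d ON d.device_sk = f.device_sk
    WHERE DATE(f.event_ts) = DATE '2024-06-01'
    GROUP BY u.country, d.device_model;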
