GCP Professional Data Engineer Practice Question

You ingest 50 million new clickstream rows per day into a partitioned and clustered BigQuery table called proj.analytics.events. A Looker Studio dashboard repeatedly runs this seven-day KPI query:

SELECT
  DATE(event_timestamp) AS event_date,
  country,
  COUNT(*) AS sessions
FROM proj.analytics.events
WHERE event_timestamp >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 7 DAY)
GROUP BY event_date, country;

Dashboard latency now exceeds 4 seconds. You already enabled BI Engine, but the working set no longer fits in memory. Without changing any dashboard SQL, which action will most effectively cut both latency and cost?

  • Purchase a larger BI Engine reservation so the last seven days of raw data are fully cached in memory.

  • Move the table to Cloud Storage as a BigLake table and enable table-level result caching for the dashboard queries.

  • Create a materialized view that implements the same aggregation and rely on BigQuery's automatic incremental refresh and query-rewrite.

  • Replace COUNT(*) with APPROX_COUNT_DISTINCT and add a LIMIT clause to all dashboard queries.
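
To make the materialized-view option concrete, here is a minimal sketch of what such a view might look like. The view name `events_daily_by_country` is hypothetical; the base table and columns are taken from the question. The seven-day filter is left out of the definition on purpose, because materialized view definitions cannot use non-deterministic functions such as CURRENT_TIMESTAMP(). The sketch assumes the dashboard query would keep applying that filter itself. Whether BigQuery rewrites a given dashboard query to read from the view depends on its query-rewrite rules, so check this against the query plan rather than assuming it.

```sql
-- Sketch only: the view name is an assumption, not part of the question.
-- The view pre-aggregates raw events to one row per (event_date, country).
CREATE MATERIALIZED VIEW `proj.analytics.events_daily_by_country` AS
SELECT
  DATE(event_timestamp) AS event_date,
  country,
  COUNT(*) AS sessions
FROM `proj.analytics.events`
GROUP BY event_date, country;
```

The point of this design is that the view stores a small daily aggregate instead of the raw rows (50 million new rows per day in the question), so a query answered from it scans far less data.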
