GCP Professional Data Engineer Practice Question

Your team is building a streaming pipeline that ingests sensor events from Pub/Sub, enriches them in Dataflow, and writes the results to a BigQuery table used by near-real-time dashboards. Compliance requires that each record contain a non-null device_id and that the temperature field fall between −50 °C and 150 °C. Bad records must be captured for offline review without stopping the pipeline, and an on-call alert must fire when data quality violations exceed 2 % of the stream. Which design best meets these requirements with the least operational overhead?

  • Write all events directly to BigQuery and schedule Dataform assertions every 10 minutes to query for null device_id or out-of-range temperatures; if violations exceed 2 %, send an alert and delete the bad rows.

  • Add a validation ParDo (or schema-aware Validate transform) in the Dataflow pipeline that routes records failing the device_id and temperature checks to a side output written to a dead-letter BigQuery table, while emitting Counter metrics exported to Cloud Monitoring with an alert set to trigger when invalid records exceed 2 % of throughput.

  • Invoke Cloud Data Loss Prevention (DLP) from Cloud Functions subscribed to Pub/Sub; if DLP finds any sensitive content or invalid temperature values, push the record to a separate Pub/Sub topic and alert via Cloud Tasks.

  • Disable streaming inserts and instead write all Pub/Sub data to Cloud Storage, then load it hourly into BigQuery with load jobs that rely on schema enforcement; examine any load job errors and configure Cloud Functions to page the team when loads fail.
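
The in-pipeline validation and dead-letter routing described in the second option can be sketched in plain Python, without the Beam SDK, to show the logic in isolation. This is a minimal sketch under stated assumptions: the names `validation_error`, `route`, and `RoutingResult` are hypothetical, records are assumed to be dictionaries with `device_id` and `temperature` keys, and the 2 % check is computed over a single in-memory batch rather than over a Cloud Monitoring metric window. In an actual Dataflow job the same predicate would typically sit inside a DoFn that emits failures to a tagged side output and increments counters.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, List, Optional

# Bounds and threshold come from the question; all names are illustrative.
MIN_TEMP_C = -50.0
MAX_TEMP_C = 150.0
ALERT_THRESHOLD = 0.02  # 2 % of the stream


def validation_error(record: Dict[str, Any]) -> Optional[str]:
    """Return a reason string if the record is invalid, else None."""
    if record.get("device_id") is None:
        return "null device_id"
    temp = record.get("temperature")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(temp, bool) or not isinstance(temp, (int, float)):
        return "missing or non-numeric temperature"
    if not MIN_TEMP_C <= temp <= MAX_TEMP_C:
        return "temperature out of range"
    return None


@dataclass
class RoutingResult:
    """Holds the main output, the dead-letter output, and derived metrics."""
    valid: List[Dict[str, Any]] = field(default_factory=list)
    dead_letter: List[Dict[str, Any]] = field(default_factory=list)

    @property
    def invalid_ratio(self) -> float:
        total = len(self.valid) + len(self.dead_letter)
        return len(self.dead_letter) / total if total else 0.0

    @property
    def should_alert(self) -> bool:
        # Alert only when violations strictly exceed the threshold.
        return self.invalid_ratio > ALERT_THRESHOLD


def route(records: Iterable[Dict[str, Any]]) -> RoutingResult:
    """Split records into valid and dead-letter outputs without raising."""
    result = RoutingResult()
    for record in records:
        reason = validation_error(record)
        if reason is None:
            result.valid.append(record)
        else:
            # Keep the original payload plus the reason for offline review.
            result.dead_letter.append({**record, "error": reason})
    return result
```

Calling `route(events)` on a batch returns both outputs in one pass, so a bad record never interrupts processing of the good ones, and `should_alert` reports whether the invalid share of that batch is above 2 %.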
