GCP Professional Data Engineer Practice Question

Your retail company runs a nightly Dataflow batch job that loads 3 TB of CSV files from a Cloud Storage bucket in europe-west1 into BigQuery. Corporate policy mandates that, if the primary region becomes unavailable, the pipeline must be restarted in another region within 30 minutes without recompiling or rebuilding the code. Which design best satisfies this disaster-recovery objective while aligning with Google-recommended practices?

  • Store the pipeline as a Dataflow Flex Template in a multi-regional Cloud Storage bucket, monitor the job with Cloud Monitoring, and trigger a Cloud Function to launch the same template in europe-west4 when the primary job fails.

  • Redesign the pipeline to use Cloud Spanner for state management; Spanner's multi-region replication will allow the existing job to keep running even if europe-west1 fails.

  • Start the job with the --automaticFailover flag so Dataflow transparently restarts the pipeline in the nearest healthy region during an outage.

  • Convert the batch job to a streaming pipeline, enable hourly Dataflow snapshots, and restore the latest snapshot in europe-west4 if europe-west1 becomes unavailable.
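
For context, the mechanism described in the first option, launching an existing Flex Template in a different region, is exposed through the Dataflow flexTemplates.launch API. The sketch below shows one way a Pub/Sub-triggered Cloud Function might re-launch the same template in europe-west4; the project ID, bucket paths, job name, and pipeline parameters are illustrative assumptions, not values from the question.

```python
# Hypothetical Cloud Function sketch: re-launch the same Flex Template in a
# secondary region (europe-west4) after the primary europe-west1 job fails.
# Project ID, bucket paths, job name, and parameters are assumptions.
from googleapiclient.discovery import build

PROJECT_ID = "my-retail-project"                      # assumed project ID
FAILOVER_REGION = "europe-west4"                      # secondary region from the scenario
TEMPLATE_SPEC = "gs://my-multiregion-bucket/templates/nightly-load.json"  # spec in a multi-regional bucket


def launch_failover_job(event, context):
    """Triggered by a Pub/Sub message from a Cloud Monitoring alert when the
    primary job fails; starts the identical Flex Template in the failover region."""
    dataflow = build("dataflow", "v1b3", cache_discovery=False)

    body = {
        "launchParameter": {
            "jobName": "nightly-csv-load-failover",
            "containerSpecGcsPath": TEMPLATE_SPEC,
            "parameters": {
                # Same pipeline options as the primary run; values are assumptions.
                "inputFilePattern": "gs://my-multiregion-bucket/exports/*.csv",
                "outputTable": "my-retail-project:sales.daily_load",
            },
        }
    }

    request = (
        dataflow.projects()
        .locations()
        .flexTemplates()
        .launch(projectId=PROJECT_ID, location=FAILOVER_REGION, body=body)
    )
    response = request.execute()
    print(f"Launched failover job: {response['job']['id']}")
    return response
```

Because the template spec and staged artifacts live in a multi-regional bucket, the launch does not depend on europe-west1 being reachable, which is what allows a restart within the 30-minute window without rebuilding any code.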

GCP Professional Data Engineer
Designing data processing systems