GCP Professional Data Engineer Practice Question

Your organization runs a streaming Apache Beam pipeline on Cloud Dataflow. The job is launched from Cloud Composer with the parameter --zone=us-central1-a and processes Pub/Sub messages into a BigQuery table. Twice last quarter, Google maintenance in us-central1-a made the worker pool unavailable, and the pipeline stopped until it was manually restarted. You must redesign the deployment so that the job automatically survives a zonal outage, requires no code changes, and adds as little operational overhead as possible. What should you do?

  • Deploy an identical standby Dataflow job in us-central1-b and let Pub/Sub round-robin messages between the two jobs.

  • Start the pipeline with only --region=us-central1 and omit any zone flag, so Dataflow can distribute and, if needed, relocate workers across multiple zones in the region (see the sketch after these options).

  • Port the pipeline to Dataproc and create a regional cluster with three master nodes spread across zones, then schedule the Spark Streaming job there.

  • Move the target BigQuery table to a multi-regional location so queries remain available even if a zone fails.
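
For reference, a minimal sketch of what a region-only launch could look like with the Beam Python SDK. The project, bucket, subscription, and table names are placeholders, and the parse step assumes JSON-encoded Pub/Sub messages; the key point is that only region is set, with no zone flag, so the Dataflow service is free to place and move workers across zones within us-central1. From Cloud Composer, the same effect comes from passing only a region through whatever operator or command launches the job.

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

# Hypothetical names: replace the project, bucket, subscription, and table.
options = PipelineOptions(
    runner="DataflowRunner",
    project="example-project",
    region="us-central1",                 # regional placement only; no zone flag,
                                          # so the service chooses the zone(s)
    temp_location="gs://example-bucket/tmp",
    streaming=True,
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            subscription="projects/example-project/subscriptions/example-sub")
        | "ParseJson" >> beam.Map(json.loads)  # assumes JSON message payloads
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "example-project:example_dataset.example_table",
            create_disposition=beam.io.BigQueryDisposition.CREATE_NEVER,  # table already exists
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND,
        )
    )
```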

Exam: GCP Professional Data Engineer
Objective: Maintaining and automating data workloads