GCP Professional Data Engineer Practice Question

Your data engineering team must orchestrate a daily batch pipeline that (1) launches two separate Dataflow jobs in parallel, (2) waits until both finish, (3) invokes a Cloud Function for data-quality validation, and (4) then loads the curated output into three BigQuery tables, one after another. The orchestration tool must let you express these fan-out/fan-in dependencies in code, offer one-click back-fills, and provide built-in task-retry policies and a graphical monitoring UI, while keeping infrastructure management to a minimum (no need to manually provision or patch scheduler clusters). Which Google Cloud service best fits these requirements?

  • Define the control flow in Google Cloud Workflows using YAML and call each service through REST APIs.

  • Deploy the pipeline as a Python-based Apache Airflow DAG in Cloud Composer, using its operators for Dataflow, Cloud Functions, and BigQuery (see the DAG sketch after the options).

  • Embed the orchestration logic inside a single batch Dataflow pipeline that invokes external services via side effects.

  • Create a set of Cloud Scheduler cron jobs that publish Pub/Sub messages to trigger Cloud Functions which, in turn, launch Dataflow templates and BigQuery loads.
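For reference, the Cloud Composer option maps directly onto an Airflow DAG. Below is a minimal sketch of such a DAG; the project ID, template paths, bucket, Cloud Function name, and table names are hypothetical placeholders, and the operators assume the apache-airflow-providers-google package installed in Composer environments.

```python
# Minimal Cloud Composer DAG sketch for the scenario above.
# All project/bucket/template/function/table names are placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowTemplatedJobStartOperator,
)
from airflow.providers.google.cloud.operators.functions import (
    CloudFunctionInvokeFunctionOperator,
)
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import (
    GCSToBigQueryOperator,
)

default_args = {
    "retries": 2,                       # built-in task-retry policy
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="daily_batch_pipeline",
    schedule="@daily",                  # daily batch; back-fills via UI/CLI
    start_date=datetime(2024, 1, 1),
    catchup=False,
    default_args=default_args,
) as dag:
    # (1) Fan-out: two Dataflow template jobs launched in parallel.
    transform_a = DataflowTemplatedJobStartOperator(
        task_id="dataflow_transform_a",
        template="gs://example-bucket/templates/transform_a",  # placeholder
        project_id="example-project",
        location="us-central1",
    )
    transform_b = DataflowTemplatedJobStartOperator(
        task_id="dataflow_transform_b",
        template="gs://example-bucket/templates/transform_b",  # placeholder
        project_id="example-project",
        location="us-central1",
    )

    # (2) + (3) Fan-in: runs only after BOTH Dataflow jobs succeed.
    validate = CloudFunctionInvokeFunctionOperator(
        task_id="data_quality_validation",
        function_id="dq-validation",    # placeholder function name
        input_data={},
        location="us-central1",
        project_id="example-project",
    )

    # (4) Three sequential BigQuery loads of the curated output.
    loads = [
        GCSToBigQueryOperator(
            task_id=f"load_{table}",
            bucket="example-bucket",                     # placeholder
            source_objects=[f"curated/{table}/*.avro"],  # placeholder
            destination_project_dataset_table=f"example-project.curated.{table}",
            source_format="AVRO",
            write_disposition="WRITE_TRUNCATE",
        )
        for table in ("orders", "customers", "inventory")
    ]

    # Dependencies: fan-out/fan-in, then a strict chain of loads.
    [transform_a, transform_b] >> validate >> loads[0] >> loads[1] >> loads[2]
```

The list-to-task dependency `[transform_a, transform_b] >> validate` expresses the fan-out/fan-in in code, and the chained `>>` operators after validation enforce the sequential BigQuery loads. Retries, back-fills, and the monitoring UI come from Airflow itself, while Composer manages the underlying scheduler infrastructure.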
