GCP Professional Data Engineer Practice Question

Your company is building a nightly Cloud Composer workflow that must (1) launch a Dataflow Flex Template job to load CSV files from Cloud Storage into a staging table, (2) execute a BigQuery SQL transformation only when the load succeeds, and (3) send an alert and prevent all downstream tasks from running if the load fails, without requiring manual cleanup before the next scheduled run. Which DAG design best satisfies these requirements?

  • Define a DataflowStartFlexTemplateOperator followed by a BigQueryInsertJobOperator; set trigger_rule="all_success" on the BigQuery task, add an on_failure_callback to the Dataflow task that calls the PagerDuty API, and leave the default scheduling so the next DAG run starts normally.

  • Use trigger_rule="all_done" on the BigQueryInsertJobOperator so it always executes; rely on an SLA miss notification to detect failures in the Dataflow task.

  • Insert a BranchPythonOperator after the Dataflow task that pushes an XCom value to decide whether to run the BigQuery task; trigger a manual DAG run if the branch takes the failure path.

  • Wrap the three steps inside a SubDagOperator and set the SubDAG's trigger_rule to "all_success"; configure no additional callbacks and clear failed tasks manually before the next run.
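For reference, below is a minimal sketch of how the pieces mentioned in the options (DataflowStartFlexTemplateOperator, BigQueryInsertJobOperator, trigger_rule, and on_failure_callback) fit together in a Composer DAG. It is illustrative only, assuming Airflow 2.x on Cloud Composer 2; the project, location, bucket, template path, SQL, and the body of the alert callback are placeholders, not values from the question.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryInsertJobOperator
from airflow.providers.google.cloud.operators.dataflow import (
    DataflowStartFlexTemplateOperator,
)


def notify_pagerduty(context):
    """Failure callback for the Dataflow load task.

    A real implementation would call the PagerDuty Events API here (for example,
    an HTTP POST with a routing key kept in an Airflow connection or secret);
    the print below is only a placeholder.
    """
    print(f"ALERT: task {context['task_instance'].task_id} failed")


with DAG(
    dag_id="nightly_csv_load",               # placeholder DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="0 2 * * *",           # nightly schedule
    catchup=False,
) as dag:

    load_csv = DataflowStartFlexTemplateOperator(
        task_id="load_csv_to_staging",
        project_id="my-project",             # placeholder
        location="us-central1",              # placeholder
        body={
            "launchParameter": {
                "jobName": "nightly-csv-load",
                "containerSpecGcsPath": "gs://my-bucket/templates/load_csv.json",  # placeholder
                "parameters": {
                    "inputFilePattern": "gs://my-bucket/incoming/*.csv",           # placeholder
                    "outputTable": "my-project:staging.raw_events",                # placeholder
                },
            }
        },
        on_failure_callback=notify_pagerduty,  # alert when the load fails
    )

    transform = BigQueryInsertJobOperator(
        task_id="transform_staging",
        configuration={
            "query": {
                "query": "CALL staging.transform_raw_events()",  # placeholder SQL
                "useLegacySql": False,
            }
        },
        # "all_success" is the default rule: the transform runs only if the load
        # succeeded; otherwise it is marked upstream_failed and never executes.
        trigger_rule="all_success",
    )

    load_csv >> transform
```

With this wiring, a failed load leaves the transform in the upstream_failed state for that run, and because a failed run does not block the next scheduled run (unless depends_on_past is set), the following night's DAG run starts without any manual clearing.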

Exam: GCP Professional Data Engineer
Domain: Ingesting and processing the data