Your data engineering team must orchestrate a daily batch pipeline that (1) launches two separate Dataflow jobs in parallel, (2) waits until both finish, (3) invokes a Cloud Function for data-quality validation, and (4) then loads the curated output into three BigQuery tables, one after another. The orchestration tool must let you express these fan-out/fan-in dependencies in code and must offer one-click back-fills, built-in task-retry policies, and a graphical monitoring UI, while keeping infrastructure management to a minimum, with no need to manually provision or patch scheduler clusters. Which Google Cloud service best fits these requirements?
Define the control flow in Google Cloud Workflows using YAML and call each service through REST APIs.
Deploy the pipeline as a Python-based Apache Airflow DAG in Cloud Composer, using its operators for Dataflow, Cloud Functions, and BigQuery.
Embed the orchestration logic inside a single batch Dataflow pipeline that invokes external services via side effects.
Create a set of Cloud Scheduler cron jobs that publish Pub/Sub messages to trigger Cloud Functions which, in turn, launch Dataflow templates and BigQuery loads.
Cloud Composer supplies a managed Apache Airflow environment where workflows are defined as Python DAGs. Airflow's native operators for Dataflow, Cloud Functions, and BigQuery make it straightforward to model parallel Dataflow jobs, a downstream Cloud Function, and sequential BigQuery loads. Airflow offers built-in retries, rich graphical monitoring, and back-fill capabilities. Because Composer provisions and maintains the underlying Airflow deployment for you, only minimal environment administration is required, satisfying the team's desire to avoid hands-on cluster management.
Workflows can orchestrate API calls, but it uses YAML/JSON definitions, lacks an Airflow-style back-fill feature and native Dataflow/BigQuery operators, and provides only a basic execution UI. Cloud Scheduler with Pub/Sub triggers each step independently and does not manage multi-step dependencies, so waiting for both Dataflow jobs before validation would have to be hand-built. Embedding orchestration into a single Dataflow job conflates processing with scheduling and offers no graphical monitoring of the orchestration steps and no back-fill. Therefore, Cloud Composer is the most suitable choice.
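The fan-out/fan-in shape described in the question can be sketched as a plain dependency graph. This is only an illustration using Python's standard-library `graphlib`, not Airflow code; the task names are hypothetical and stand in for the two Dataflow jobs, the validation function, and the three BigQuery loads.

```python
from graphlib import TopologicalSorter

# Hypothetical task names (not from any real pipeline).
# Each task maps to the set of tasks that must finish before it starts.
dependencies = {
    "dataflow_job_a": set(),
    "dataflow_job_b": set(),
    # Fan-in: validation waits for both Dataflow jobs.
    "validate_quality": {"dataflow_job_a", "dataflow_job_b"},
    # Sequential loads: each one waits for the previous one.
    "load_table_1": {"validate_quality"},
    "load_table_2": {"load_table_1"},
    "load_table_3": {"load_table_2"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks in the same wave may run in parallel."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # every task whose prerequisites are done
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
for number, wave in enumerate(waves, start=1):
    print(number, wave)
```

In an Airflow DAG the same shape is typically written with the `>>` operator, along the lines of `[job_a, job_b] >> validate >> load_1 >> load_2 >> load_3`, where each name would be an operator instance defined in the DAG file.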
GCP Professional Data Engineer
Ingesting and processing the data