Your team ingests 15 TB of compressed application logs into Cloud Storage every night and immediately loads the data into a BigQuery staging table. A batch Dataflow pipeline then executes a series of SQL-like joins, filters, and aggregations before writing the daily results into BigQuery reporting tables. The Dataflow job's worker and shuffle costs have grown significantly, and the team wants to reduce operational overhead while keeping the transformation logic in ANSI-compatible SQL under version control. What should you recommend?
Re-implement the pipeline with Dataflow SQL templates and trigger them nightly with Cloud Scheduler.
Move the transformation logic into BigQuery by creating version-controlled SQL files managed with Dataform or scheduled queries, and drop the Dataflow job.
Keep the Dataflow pipeline but orchestrate it with Cloud Data Fusion to simplify management.
Replace the Dataflow job with a Dataproc cluster that runs Spark SQL notebooks scheduled by Cloud Composer.
BigQuery can perform ELT-style transformations directly on data that already resides in its managed storage layer, which eliminates the Dataflow worker and shuffle charges. Running the statements as scheduled queries or stored procedures, or through Dataform, lets the team keep the SQL in version-controlled files and execute it on a defined cadence. The other options fall short: Dataproc or self-managed Spark increases operational overhead, Dataflow SQL still incurs Dataflow execution costs, and Cloud Data Fusion orchestration does not remove the underlying Dataflow runtime cost.
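The ELT pattern described above can be sketched locally. This is only an illustration under stated assumptions: Python's built-in `sqlite3` stands in for BigQuery, and the table names (`staging_logs`, `daily_error_report`), columns, and sample rows are hypothetical. The point is that the transformation is a single SQL statement that could live in a version-controlled file and run where the data already sits.

```python
import sqlite3

# Hypothetical transformation. In practice this text would live in a
# version-controlled .sql file and be run by a scheduler (for example a
# scheduled query or Dataform), not embedded in Python.
DAILY_REPORT_SQL = """
INSERT INTO daily_error_report (log_date, service, error_count)
SELECT log_date, service, COUNT(*) AS error_count
FROM staging_logs
WHERE severity = 'ERROR'
GROUP BY log_date, service
"""


def build_demo_connection():
    """Create an in-memory database standing in for the warehouse (assumption)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE staging_logs (log_date TEXT, service TEXT, severity TEXT)"
    )
    conn.execute(
        "CREATE TABLE daily_error_report "
        "(log_date TEXT, service TEXT, error_count INTEGER)"
    )
    # Made-up sample rows that play the role of the nightly load.
    conn.executemany(
        "INSERT INTO staging_logs VALUES (?, ?, ?)",
        [
            ("2024-01-01", "api", "ERROR"),
            ("2024-01-01", "api", "ERROR"),
            ("2024-01-01", "api", "INFO"),
            ("2024-01-01", "web", "ERROR"),
        ],
    )
    return conn


def run_daily_transform(conn):
    """Run the filter and aggregation inside the database, then read the report."""
    conn.execute(DAILY_REPORT_SQL)
    conn.commit()
    return conn.execute(
        "SELECT log_date, service, error_count "
        "FROM daily_error_report ORDER BY log_date, service"
    ).fetchall()


if __name__ == "__main__":
    for row in run_daily_transform(build_demo_connection()):
        print(row)
```

Because the rows never leave the database engine, there is no separate worker fleet or shuffle stage to pay for. In BigQuery the same statement would be billed as a query over the staging table.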
GCP Professional Data Engineer
Ingesting and processing the data