A digital retailer ingests 2 TB of raw click-stream logs into Cloud Storage every night. Analysts frequently refine attribution logic and must be able to replay all historical data within hours without managing long-running clusters. To minimize operational overhead and avoid repeated data movement, which integration pattern and execution environment should you recommend for the transformation stage of the pipeline?
Keep existing BigQuery tables as the source and copy them into Cloud SQL so downstream applications can consume them (reverse ETL).
Load the raw logs into BigQuery and perform all cleansing and attribution logic there using SQL (ELT).
Spin up a transient Dataproc cluster each night to transform the logs before loading the curated results into BigQuery (ETL).
Stream the logs through Pub/Sub into a Dataflow job that transforms and writes the output to BigQuery in near real time (streaming ETL).
Loading the raw files directly into BigQuery and then applying successive SQL transformations follows an ELT pattern: extract from the source, load into the analytical store, and transform in place. Because BigQuery is a fully managed, serverless warehouse that scales storage and compute independently, analysts can rerun or add new transformation queries over all historical data quickly without provisioning clusters. Transforming data in Dataproc or Dataflow before loading is an ETL approach: it still requires operating or scheduling processing infrastructure, and the job must be re-run over the raw files whenever business rules change. The streaming variant adds a further mismatch, because the logs already arrive in Cloud Storage as a nightly batch, so routing them through Pub/Sub introduces extra data movement without making historical replay any easier. Publishing data out of BigQuery to operational systems is reverse ETL, which does not address the need to flexibly reprocess historical raw data inside the warehouse.
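The load-once, transform-in-place idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the standard-library `sqlite3` module stands in for BigQuery so the example runs locally, and the table names (`raw_clicks`, `attribution`), the columns, and the helper `rebuild_attribution` are hypothetical, not part of the scenario. It shows the raw events being loaded a single time and the curated table being rebuilt under two different attribution rules without touching the raw data again.

```python
import sqlite3

# Assumption: sqlite3 is a local stand-in for the warehouse (BigQuery in the scenario).
conn = sqlite3.connect(":memory:")

# "Load": land the raw events untransformed. The schema is hypothetical.
conn.execute(
    "CREATE TABLE raw_clicks (user_id TEXT, channel TEXT, ts INTEGER, converted INTEGER)"
)
conn.executemany(
    "INSERT INTO raw_clicks VALUES (?, ?, ?, ?)",
    [
        ("u1", "email", 1, 0),
        ("u1", "search", 2, 0),
        ("u1", "social", 3, 1),
        ("u2", "search", 1, 0),
        ("u2", "email", 2, 1),
    ],
)


def rebuild_attribution(conn, pick):
    """'Transform': rebuild the curated table from the raw table with SQL.

    pick is "MIN" for first-touch or "MAX" for last-touch attribution.
    """
    if pick not in ("MIN", "MAX"):
        raise ValueError("pick must be 'MIN' or 'MAX'")
    conn.execute("DROP TABLE IF EXISTS attribution")
    conn.execute(
        f"""
        CREATE TABLE attribution AS
        SELECT r.user_id, r.channel
        FROM raw_clicks AS r
        JOIN (
            SELECT user_id, {pick}(ts) AS touch_ts
            FROM raw_clicks
            GROUP BY user_id
        ) AS t
          ON r.user_id = t.user_id AND r.ts = t.touch_ts
        WHERE r.user_id IN (SELECT user_id FROM raw_clicks WHERE converted = 1)
        """
    )
    return dict(conn.execute("SELECT user_id, channel FROM attribution").fetchall())


# Changing the attribution rule only means re-running the transformation;
# the raw table is never reloaded or moved.
first_touch = rebuild_attribution(conn, "MIN")
last_touch = rebuild_attribution(conn, "MAX")
print(first_touch)
print(last_touch)
```

In BigQuery the same shape would typically be a load job from Cloud Storage followed by scheduled or ad hoc SQL statements; the point of the sketch is only that a rule change becomes a query change rather than a re-ingestion.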
GCP Professional Data Engineer
Ingesting and processing the data