A retailer runs a Dataflow streaming pipeline that consumes customer events from Pub/Sub and enriches them with product data stored in BigQuery. The analytics team now needs to add near-real-time inventory information coming from an on-premises Oracle 12c database. The solution must capture change data without installing agents on the source database server to avoid performance impact, deliver the change stream to the existing Dataflow job with seconds-level latency, and minimize custom code and ongoing operational overhead. What should you do?
Use Database Migration Service in continuous replication mode to export Oracle change logs to Cloud Storage and schedule a nightly Dataflow batch job that loads incremental files into BigQuery.
Set up BigQuery connection federation to the on-premises Oracle instance and reference the external tables directly from the existing Dataflow SQL transform.
Deploy an open-source Debezium cluster on Google Kubernetes Engine to read Oracle changes and write them to a Kafka cluster; create a new Kafka I/O transform in Dataflow to consume the data.
Configure Datastream to capture Oracle redo logs, route the stream through the Datastream-to-Pub/Sub template, and add the new Pub/Sub subscription to the existing Dataflow pipeline for real-time enrichment.
Datastream is Google Cloud's serverless change data capture (CDC) service. It continuously reads Oracle redo logs without installing agents on the source server, so it imposes minimal load on the database. Datastream writes change events to Cloud Storage, and the Datastream-to-Pub/Sub Dataflow template streams every event to a Pub/Sub topic. Adding a subscription on this topic as an additional input to the existing Dataflow pipeline meets the latency, agentless, and low-operations requirements. The alternatives do not satisfy the constraints: Database Migration Service does not stream Oracle changes into Pub/Sub, and a nightly batch load cannot deliver seconds-level latency; Debezium on GKE introduces significant operational burden, because you must run both the Debezium and Kafka clusters yourself; and BigQuery federation does not provide streaming CDC.
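To make the enrichment step concrete, here is a minimal sketch in plain Python of what the pipeline has to do with each change event once it arrives from the new subscription: keep a current view of inventory and attach it to customer events. It is an illustration under stated assumptions, not the template's or the pipeline's actual code. The JSON envelope (`source_metadata.change_type`, `payload`) is assumed to resemble Datastream's event format, the column names `SKU` and `QTY_ON_HAND` are hypothetical, and the module-level dictionary stands in for what would be per-key state or a side input in a real Dataflow job.

```python
import json
from typing import Dict

# Hypothetical in-memory inventory view keyed by SKU. In a real Dataflow job
# this would be per-key state or a side input, not a plain dict.
Inventory = Dict[str, int]


def apply_change(inventory: Inventory, raw_message: bytes) -> None:
    """Apply one CDC event (assumed JSON envelope) to the inventory view."""
    event = json.loads(raw_message.decode("utf-8"))
    change_type = event["source_metadata"]["change_type"]  # assumed field name
    row = event["payload"]                                  # assumed field name
    sku = row["SKU"]                                        # hypothetical column
    if change_type == "DELETE":
        inventory.pop(sku, None)
    else:
        # INSERT and UPDATE are both treated as "latest row image wins".
        inventory[sku] = int(row["QTY_ON_HAND"])            # hypothetical column


def enrich(customer_event: dict, inventory: Inventory) -> dict:
    """Return a copy of the customer event with the latest known stock level."""
    enriched = dict(customer_event)
    enriched["qty_on_hand"] = inventory.get(customer_event["sku"])
    return enriched


def make_change(change_type: str, sku: str, qty: int = 0) -> bytes:
    """Build a sample change message in the assumed envelope (for illustration)."""
    return json.dumps({
        "source_metadata": {"table": "INVENTORY", "change_type": change_type},
        "payload": {"SKU": sku, "QTY_ON_HAND": qty},
    }).encode("utf-8")


if __name__ == "__main__":
    inventory: Inventory = {}
    apply_change(inventory, make_change("INSERT", "A1", 5))
    apply_change(inventory, make_change("UPDATE", "A1", 3))
    print(enrich({"customer_id": "c-42", "sku": "A1"}, inventory))
```

A SKU that has been deleted, or never seen, enriches to `qty_on_hand` of `None`, which lets downstream steps distinguish "unknown" from "zero in stock".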
GCP Professional Data Engineer
Ingesting and processing the data