Your data engineering team needs to stream change data capture (CDC) events from an on-premises MySQL 8.0 database into BigQuery with end-to-end latency under two minutes. The solution must:
require minimal custom code and operational overhead,
automatically propagate source-table schema changes to BigQuery, and
allow the future onboarding of an additional PostgreSQL source without major re-engineering.
Which approach best meets these requirements?
Deploy a self-managed Debezium connector that streams MySQL binlogs into Pub/Sub topics and trigger Cloud Functions to write each message to BigQuery.
Configure Datastream to capture MySQL binary log events into a Cloud Storage staging bucket and invoke the Google-provided Datastream-to-BigQuery Dataflow template for continuous loading into BigQuery.
Set up asynchronous replication from the on-premises MySQL instance to Cloud SQL, then query the replica directly from BigQuery using federated queries refreshed every hour.
Schedule daily exports from MySQL using BigQuery Data Transfer Service to load CSV files from Cloud Storage into BigQuery tables.
Google Cloud Datastream provides a fully managed, serverless way to capture MySQL and PostgreSQL change events. You configure Datastream to read the MySQL 8.0 binary log (using GTID-based replication) and write the events to a Cloud Storage staging bucket, then launch the Google-provided "Datastream to BigQuery" Dataflow template. This managed template continuously loads the change stream into BigQuery, handles schema changes from the source, and requires no custom Dataflow code or infrastructure management. Onboarding a future PostgreSQL source entails only creating an additional Datastream stream whose output is loaded into the same BigQuery dataset. In contrast:
Running a self-managed Debezium connector with Pub/Sub and Cloud Functions introduces significant operational burden and custom code.
Daily BigQuery Data Transfer Service loads are batch-oriented and violate the two-minute latency goal.
Cloud SQL replication plus hourly federated queries delivers hourly, batch-style latency and adds administrative overhead.
Therefore, the Datastream plus managed template approach best satisfies the stated constraints.
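To make the idea of continuously loading a change stream more concrete, here is a minimal sketch of how CDC events might be applied to a destination table. It is an illustration only, not Datastream's or the Dataflow template's actual implementation: the `ChangeEvent` and `Replica` classes, their field names, and the sample data are all hypothetical. It assumes each event carries a primary key, a source timestamp, and a full row image, and that schema changes are additive (new columns only).

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Optional, Set


@dataclass
class ChangeEvent:
    """Hypothetical, simplified CDC event; not Datastream's actual record format."""
    op: str                               # "INSERT", "UPDATE", or "DELETE"
    key: int                              # primary-key value of the affected row
    ts: int                               # source commit timestamp, used for ordering
    row: Optional[Dict[str, Any]] = None  # full row image; None for deletes


@dataclass
class Replica:
    """In-memory stand-in for the destination table."""
    rows: Dict[int, Dict[str, Any]] = field(default_factory=dict)
    columns: Set[str] = field(default_factory=set)
    last_ts: Dict[int, int] = field(default_factory=dict)

    def apply(self, event: ChangeEvent) -> None:
        # Skip stale events so late or duplicate deliveries cannot overwrite newer state.
        if event.ts <= self.last_ts.get(event.key, -1):
            return
        self.last_ts[event.key] = event.ts
        if event.op == "DELETE":
            self.rows.pop(event.key, None)
            return
        # Additive schema evolution: columns not seen before join the replica's schema.
        self.columns.update(event.row)
        self.rows[event.key] = dict(event.row)


def replay(events: List[ChangeEvent]) -> Replica:
    """Apply events in arrival order and return the resulting replica."""
    replica = Replica()
    for event in events:
        replica.apply(event)
    return replica


# Hypothetical sample stream: the third event introduces a new "email" column,
# and the fourth arrives late with an older timestamp.
events = [
    ChangeEvent("INSERT", 1, 100, {"id": 1, "name": "Ada"}),
    ChangeEvent("INSERT", 2, 101, {"id": 2, "name": "Bob"}),
    ChangeEvent("UPDATE", 1, 102, {"id": 1, "name": "Ada", "email": "ada@example.com"}),
    ChangeEvent("UPDATE", 1, 100, {"id": 1, "name": "stale"}),
    ChangeEvent("DELETE", 2, 103),
]
replica = replay(events)
```

In this sketch the late update for key 1 is ignored, the delete removes key 2, and the new `email` column is added to the schema without any manual step. A managed pipeline takes on this kind of bookkeeping so the team does not have to write it.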
GCP Professional Data Engineer
Ingesting and processing the data