Your team must plan a pipeline for a global e-commerce platform. Mobile apps produce 50 000 events/sec that must power millisecond-latency personalization and be available for ad-hoc analytics in BigQuery. Compliance demands customer-managed encryption for stored data, no public IP addresses on processing workers, and a perimeter that blocks data exfiltration. Which architecture meets every requirement while minimising operational overhead?
Publish events to Pub/Sub; process them with streaming Dataflow jobs that use CMEK, disable public IPs, and run inside a VPC Service Controls perimeter; have the pipeline write in parallel to Bigtable for low-latency serving and BigQuery for analytics.
Use Cloud Data Fusion real-time pipelines to read from Pub/Sub and write to Spanner for both analytics and serving; rely on Cloud NAT for egress and Cloud DLP for data encryption.
Stream events to a self-managed Apache Kafka cluster on Compute Engine; use Spark Streaming to land data in Cloud Storage; query it through BigQuery external tables.
Send events to Cloud Storage with default encryption; schedule Dataproc Spark batch jobs on preemptible VMs via Cloud NAT; load the results into BigQuery and Cloud SQL for serving.
The correct design streams events through Pub/Sub, a fully managed messaging service that scales to high throughput with little operational work. A streaming Dataflow job reads from Pub/Sub and writes to Bigtable and BigQuery in the same pipeline. Dataflow supports customer-managed encryption keys (CMEK), can run its workers without public IP addresses, and can be placed inside a VPC Service Controls perimeter, which together satisfy the security and compliance constraints. Bigtable delivers single-digit-millisecond reads for personalization, while BigQuery provides interactive ad-hoc analytics.
The other options each fail at least one requirement. The Dataproc batch design cannot meet the low-latency requirement and relies on default encryption instead of CMEK. The self-managed Kafka approach raises operational overhead and leaves CMEK and perimeter controls for the team to set up and maintain itself. The Cloud Data Fusion option stores data in Spanner (not well suited to ad-hoc analytics), relies on Cloud NAT (which still gives workers a path to the public internet), and treats Cloud DLP, a data inspection and masking service, as if it provided encryption.
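To make the Dataflow security settings concrete, here is a minimal sketch of the launch flags such a job might use. The flag names (`--dataflow_kms_key`, `--no_use_public_ips`, `--subnetwork`) are Apache Beam Python pipeline options for the Dataflow runner; the helper function and every project, key, bucket, and subnet name are hypothetical placeholders, not values from the question.

```python
def dataflow_security_args(project, region, kms_key, subnetwork, temp_location):
    """Build Beam pipeline flags for a streaming Dataflow job with CMEK and private workers.

    Hypothetical helper for illustration; it only assembles the argument list.
    """
    return [
        "--runner=DataflowRunner",
        "--streaming",                      # unbounded Pub/Sub source
        f"--project={project}",
        f"--region={region}",
        f"--temp_location={temp_location}",
        f"--dataflow_kms_key={kms_key}",    # CMEK for data the job stores
        "--no_use_public_ips",              # workers get internal IPs only
        f"--subnetwork={subnetwork}",       # subnet the workers attach to
    ]


# All values below are placeholders (assumptions), not real resources.
args = dataflow_security_args(
    project="example-project",
    region="us-central1",
    kms_key=(
        "projects/example-project/locations/us-central1/"
        "keyRings/example-ring/cryptoKeys/example-key"
    ),
    subnetwork="regions/us-central1/subnetworks/example-subnet",
    temp_location="gs://example-bucket/tmp",
)

for flag in args:
    print(flag)
```

The list could be passed to Beam's `PipelineOptions` when constructing the pipeline. Two things sit outside these flags: the VPC Service Controls perimeter is defined at the organization level rather than per job, and workers without public IPs need Private Google Access enabled on their subnet to reach Google APIs.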
GCP Professional Data Engineer
Ingesting and processing the data