GCP Professional Data Engineer Practice Question

Your organization runs a high-throughput Apache Kafka cluster in an on-premises data center. Security policy forbids any outbound connections from the data center; only clients in Google Cloud may initiate TLS-encrypted, private (VPN/Interconnect) connections to on-prem resources. You must build a near-real-time pipeline that ingests the Kafka messages into BigQuery with minimal additional operational overhead. Which approach should you choose for the streaming source of the pipeline?

  • Configure Kafka Connect to batch-export topic data to Cloud Storage every minute, then run a scheduled Dataflow batch job to load the files into BigQuery.

  • Refactor the on-prem producers to send events directly to Cloud Pub/Sub over the internet and have Dataflow read from the Pub/Sub subscription.

  • Run a serverless Dataflow streaming job that uses the Apache Kafka I/O connector to consume the on-prem Kafka topic over the private VPN and stream the data into BigQuery.

  • Deploy Pub/Sub Lite on-premises with Pub/Sub Edge, publish the Kafka messages into the new topic, and have Dataflow read from Pub/Sub Lite.
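Of the four options, only the serverless Dataflow streaming job with the Apache Kafka I/O connector satisfies every constraint: the Dataflow workers run in Google Cloud and initiate the TLS connection to the on-prem brokers over the private VPN/Interconnect (so nothing in the data center opens an outbound connection), the pipeline is continuous rather than minute-level batch, and the managed service adds minimal operational overhead. Below is a minimal sketch of such a pipeline using the Beam Java SDK's KafkaIO; the broker address, topic name, BigQuery table, and one-column schema are hypothetical placeholders, and the workers are assumed to run on a VPC subnetwork that routes to the data center.

```java
import com.google.api.services.bigquery.model.TableRow;
import java.util.Map;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.kafka.KafkaIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.MapElements;
import org.apache.beam.sdk.values.TypeDescriptor;
import org.apache.kafka.common.serialization.StringDeserializer;

public class KafkaToBigQuery {
  public static void main(String[] args) {
    Pipeline p = Pipeline.create(
        PipelineOptionsFactory.fromArgs(args).withValidation().create());

    p.apply("ReadFromKafka", KafkaIO.<String, String>read()
            // Hypothetical broker address, reachable over the private VPN/Interconnect route.
            .withBootstrapServers("kafka.onprem.example:9093")
            .withTopic("events") // hypothetical topic name
            .withKeyDeserializer(StringDeserializer.class)
            .withValueDeserializer(StringDeserializer.class)
            // TLS as the security policy requires; the client side (in Google Cloud) initiates.
            .withConsumerConfigUpdates(Map.<String, Object>of("security.protocol", "SSL"))
            .withoutMetadata()) // drop Kafka metadata, keep KV<key, value>
     .apply("ToTableRow", MapElements.into(TypeDescriptor.of(TableRow.class))
            // Illustrative single-column row; a real pipeline would parse the payload.
            .via(kv -> new TableRow().set("payload", kv.getValue())))
     .apply("WriteToBigQuery", BigQueryIO.writeTableRows()
            .to("my-project:my_dataset.events") // hypothetical destination table
            .withCreateDisposition(BigQueryIO.Write.CreateDisposition.CREATE_NEVER)
            .withWriteDisposition(BigQueryIO.Write.WriteDisposition.WRITE_APPEND));

    p.run(); // unbounded source, so Dataflow executes this as a streaming job
  }
}
```

Because the source is unbounded, BigQueryIO defaults to streaming inserts, keeping the end-to-end path near-real-time. By contrast, the Kafka Connect option introduces a minute-level batch hop through Cloud Storage, refactoring producers to publish to Pub/Sub would require outbound internet connections from the data center, and Pub/Sub Lite is a Google Cloud service that cannot be deployed on-premises.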

Exam: GCP Professional Data Engineer
Objective: Ingesting and processing the data