GCP Professional Data Engineer Practice Question

Your team runs a Python Dataflow streaming pipeline that ingests 20,000 JSON events per second from Pub/Sub and writes enriched rows to BigQuery. Each event must be classified by an existing Vertex AI model, and the end-to-end latency budget is 200 ms even during peak load. The solution must stay fully serverless and keep prediction cost as low as possible. How should you integrate the inference step into the pipeline?

  • Write events to Cloud Storage and launch a Vertex AI batch prediction job every minute, then read the output back into the streaming pipeline.

  • Use a GroupIntoBatches transform to assemble small bundles of events and send each bundle as a single gRPC request to the Vertex AI online prediction endpoint from the Dataflow worker.

  • Stream events directly into BigQuery and execute a scheduled BigQuery ML remote-model query every 30 seconds to populate the classification column.

  • Invoke the Vertex AI online prediction endpoint synchronously for each individual event inside a MapElements transform.
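For reference, the micro-batching pattern described in the second option can be sketched in plain Python (outside Beam). This is an illustrative sketch only: `group_into_batches` mirrors what Beam's `GroupIntoBatches` transform does, `predict_batch` is a hypothetical stand-in for one gRPC call to a Vertex AI online prediction endpoint, and the batch size of 64 is an assumption to be tuned against the 200 ms latency budget, not a recommendation.

```python
# Sketch of micro-batching events before calling an online prediction
# endpoint. All names here are illustrative stand-ins, not real APIs.

from typing import Iterable, List

BATCH_SIZE = 64  # assumed bundle size; tune against the latency budget


def group_into_batches(events: Iterable[dict],
                       batch_size: int = BATCH_SIZE) -> List[List[dict]]:
    """Assemble events into fixed-size bundles, mirroring Beam's
    GroupIntoBatches transform."""
    batch: List[dict] = []
    batches: List[List[dict]] = []
    for event in events:
        batch.append(event)
        if len(batch) == batch_size:
            batches.append(batch)
            batch = []
    if batch:  # flush the final partial bundle
        batches.append(batch)
    return batches


def predict_batch(batch: List[dict]) -> List[str]:
    """Placeholder for a single gRPC request carrying the whole bundle."""
    return ["classified"] * len(batch)


# One request per bundle instead of one per event: at 20,000 events/s
# and a bundle size of 64, that is roughly 313 requests/s instead of
# 20,000, which is how this option keeps prediction cost low.
events = [{"id": i} for i in range(200)]
batches = group_into_batches(events)
predictions = [p for b in batches for p in predict_batch(b)]
```

The trade-off the question is probing: per-event synchronous calls (the last option) multiply request overhead and cost, while batch prediction jobs or scheduled queries (the other two options) cannot meet a 200 ms streaming latency budget.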

Domain: Ingesting and processing the data