GCP Professional Data Engineer Practice Question

A retail media team ingests product descriptions, reviews, and support-chat transcripts into BigQuery. They need to power a retrieval-augmented generation (RAG) service that answers natural-language questions about product issues. Design goals include:

  • The LLM must receive only the top 10 semantically closest text chunks, each ≤ 2 KB.
  • Embeddings should use a managed, up-to-date model without exporting data outside Google Cloud.
  • Monthly ingestion adds 20 million new chunks; query latency must stay below 500 ms.

Which architecture best meets the goals while minimizing operational overhead?

  • Generate embeddings with a custom TensorFlow container on Vertex AI Pipelines, store them in Firestore, and run similarity queries with a Cloud Run microservice that implements HNSW.

  • Use ML.GENERATE_EMBEDDING to write vectors into a BigQuery table with a vector index on the embedding column, and issue VECTOR_SEARCH queries at read time to retrieve the 10 nearest chunks for the prompt sent to the LLM.

  • Export the text as JSON to Cloud Storage, use the open-source FAISS library on a GKE Autopilot cluster for indexing and ANN search, and stream the results back into BigQuery before calling the LLM.

  • Store the text in a Bigtable row for each chunk, use Dataproc Serverless with Spark MLlib to build word2vec embeddings, and push the top 10 matches into a BI Engine cache that the LLM queries.
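
For reference, the BigQuery-native option can be sketched as two SQL statements built from Python. This is a minimal sketch under stated assumptions, not a definitive implementation: the table names, the remote model name, and the column names (chunk_id, chunk_text, embedding) are hypothetical, and "2 KB" is read here as 2048 bytes. The code only builds query strings and checks the prompt constraints, so it runs without Google Cloud credentials.

```python
TOP_K = 10              # goal from the question: top 10 closest chunks
MAX_CHUNK_BYTES = 2048  # assumption: "2 KB" interpreted as 2048 bytes


def embedding_sql(model: str, source_table: str, target_table: str) -> str:
    """SQL that embeds every chunk and stores the vectors in BigQuery.

    ML.GENERATE_EMBEDDING expects the text in a column named `content`.
    """
    return f"""
CREATE OR REPLACE TABLE `{target_table}` AS
SELECT chunk_id, content, ml_generate_embedding_result AS embedding
FROM ML.GENERATE_EMBEDDING(
  MODEL `{model}`,
  (SELECT chunk_id, chunk_text AS content FROM `{source_table}`),
  STRUCT(TRUE AS flatten_json_output)
)"""


def search_sql(model: str, chunks_table: str, top_k: int = TOP_K) -> str:
    """SQL that embeds the question (@question) and finds its nearest chunks."""
    return f"""
SELECT base.chunk_id, base.content, distance
FROM VECTOR_SEARCH(
  TABLE `{chunks_table}`, 'embedding',
  (SELECT ml_generate_embedding_result AS embedding
   FROM ML.GENERATE_EMBEDDING(
     MODEL `{model}`,
     (SELECT @question AS content),
     STRUCT(TRUE AS flatten_json_output))),
  top_k => {top_k},
  distance_type => 'COSINE'
)
ORDER BY distance"""


def fits_prompt(chunks: list[str]) -> bool:
    """True if the chunk list respects both limits stated in the question."""
    return len(chunks) <= TOP_K and all(
        len(c.encode("utf-8")) <= MAX_CHUNK_BYTES for c in chunks
    )


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    print(embedding_sql("proj.ds.embed_model", "proj.ds.raw_chunks", "proj.ds.chunks"))
    print(search_sql("proj.ds.embed_model", "proj.ds.chunks"))
```

To execute these statements, one would pass them to a BigQuery client and bind the user's question to the `@question` query parameter; that wiring is omitted here.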

Exam domain: Preparing and using data for analysis