GCP Professional Data Engineer Practice Question

You are asked to build an end-to-end retrieval-augmented generation (RAG) solution that stays entirely inside BigQuery. Product-support articles are already ingested into a table customer_docs(doc_id STRING, text STRING). You must (1) turn every text row into a vector, (2) store the vectors for fast similarity search, and (3) retrieve the most relevant passages at runtime when a user submits a question. Which implementation satisfies all three requirements with minimal custom infrastructure?

  • Train an AutoML text classification model with CREATE MODEL, use ML.PREDICT to label documents, and return articles whose label matches the intent predicted for the user's question.

  • Apply ML.FEATURE_CROSS and ML.NORMALIZER to transform the text column, then build a materialized view and join on cosine distance calculated in SQL whenever a user asks a question.

  • Create a new table from the output of ML.GENERATE_EMBEDDING, called with a remote text-embedding model (for example textembedding-gecko) over the text column of customer_docs, and store the resulting ARRAY<FLOAT64> embedding column alongside doc_id. At query time, embed the user prompt with the same function and call VECTOR_SEARCH over the table to return the top K matching rows.

  • Export customer_docs to Cloud Storage as JSON, invoke a Vertex AI batch prediction job to create embeddings, load the output back as an external BigLake table, and query it with standard equality filters.
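
For reference, the embed, store, and search flow described in the third option might look roughly like the following BigQuery SQL. This is a minimal sketch, not a definitive implementation: the dataset name my_dataset, the connection us.my_connection, the model name embedding_model, the output table customer_docs_embedded, and the sample question are all hypothetical, and the endpoint string is an assumption that may need a version suffix or a newer embedding model in a real project.

```sql
-- Assumption: a Cloud resource connection named `us.my_connection` already exists
-- and its service account is allowed to call Vertex AI.
CREATE OR REPLACE MODEL `my_dataset.embedding_model`
  REMOTE WITH CONNECTION `us.my_connection`
  OPTIONS (ENDPOINT = 'textembedding-gecko');  -- endpoint name is an assumption

-- (1) and (2): embed every row and store the vectors in a table.
-- ML.GENERATE_EMBEDDING is a table-valued function that expects the input
-- text in a column named `content`.
CREATE OR REPLACE TABLE `my_dataset.customer_docs_embedded` AS
SELECT
  doc_id,
  content AS text,
  ml_generate_embedding_result AS embedding  -- ARRAY<FLOAT64>
FROM ML.GENERATE_EMBEDDING(
  MODEL `my_dataset.embedding_model`,
  (SELECT doc_id, text AS content FROM `my_dataset.customer_docs`),
  STRUCT(TRUE AS flatten_json_output)
);

-- (3): embed the user's question with the same model and retrieve the
-- closest passages.
SELECT
  base.doc_id,
  base.text,
  distance
FROM VECTOR_SEARCH(
  TABLE `my_dataset.customer_docs_embedded`,
  'embedding',
  (
    SELECT ml_generate_embedding_result AS embedding
    FROM ML.GENERATE_EMBEDDING(
      MODEL `my_dataset.embedding_model`,
      (SELECT 'How do I reset my password?' AS content),  -- hypothetical question
      STRUCT(TRUE AS flatten_json_output)
    )
  ),
  top_k => 5,
  distance_type => 'COSINE'
)
ORDER BY distance;
```

Without a vector index, VECTOR_SEARCH falls back to a brute-force scan, which is usually fine for a small article table. For larger tables, a CREATE VECTOR INDEX statement on the embedding column is the usual way to get approximate nearest-neighbor lookups.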

Exam objective: Preparing and using data for analysis