GCP Professional Data Engineer Practice Question

You are asked to build an end-to-end retrieval-augmented generation (RAG) solution that stays entirely inside BigQuery. Product-support articles are already ingested into a table customer_docs(doc_id STRING, text STRING). You must (1) turn every text row into a vector, (2) store the vectors for fast similarity search, and (3) retrieve the most relevant passages at runtime when a user submits a question. Which implementation satisfies all three requirements with minimal custom infrastructure?

Train an AutoML text classification model with CREATE MODEL, use ML.PREDICT to label documents, and return articles whose label matches the intent predicted for the user's question.
Apply ML.FEATURE_CROSS and ML.NORMALIZER to transform the text column, then build a materialized view and join on cosine distance calculated in SQL whenever a user asks a question.
Create a new table with SELECT doc_id, ML.GENERATE_EMBEDDING(MODEL textembedding-gecko, text) AS embedding FROM customer_docs; store the ARRAY<FLOAT64> column. At query time, embed the user prompt with the same function and call VECTOR_SEARCH over the table to return the top K matching rows.
Export customer_docs to Cloud Storage as JSON, invoke a Vertex AI batch prediction job to create embeddings, load the output back as an external BigLake table, and query it with standard equality filters.
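
The three requirements in the question (embed each row, store the vectors, retrieve the closest passages) can be illustrated with a small sketch outside BigQuery. This is not BigQuery code: the embed function below is a toy term-count vectorizer standing in for a real embedding model, vector_search is a brute-force scan rather than an indexed search, and the sample rows, function names, and top_k value are all hypothetical.

```python
import math


def build_vocab(texts):
    """Assign each distinct lowercase token a fixed vector position."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def embed(text, vocab):
    """Toy stand-in for an embedding model: L2-normalized term counts."""
    vec = [0.0] * len(vocab)
    for token in text.lower().split():
        if token in vocab:  # tokens unseen in the corpus are ignored
            vec[vocab[token]] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm else vec


def cosine_distance(a, b):
    """Cosine distance between two vectors that are already unit length."""
    return 1.0 - sum(x * y for x, y in zip(a, b))


def vector_search(index, query_vec, top_k=2):
    """Brute-force nearest-neighbor scan: return the top_k closest doc_ids."""
    scored = [(cosine_distance(vec, query_vec), doc_id)
              for doc_id, vec in index.items()]
    return [(doc_id, dist) for dist, doc_id in sorted(scored)[:top_k]]


# Hypothetical rows shaped like customer_docs(doc_id, text).
customer_docs = {
    "d1": "reset your password from the login page",
    "d2": "track a shipment with the order number",
    "d3": "request a refund for damaged items",
}

# (1) turn every text row into a vector, (2) store the vectors.
vocab = build_vocab(customer_docs.values())
index = {doc_id: embed(text, vocab) for doc_id, text in customer_docs.items()}

# (3) embed the user question the same way and retrieve the closest rows.
results = vector_search(index, embed("how do I reset my password", vocab))
for doc_id, dist in results:
    print(doc_id, round(dist, 3))
```

The point of the sketch is that documents and the question must be embedded by the same function so that their distances are comparable, and that retrieval is a ranking by distance rather than an equality match or a label lookup.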

GCP Professional Data Engineer

Preparing and using data for analysis
