GCP Professional Data Engineer Practice Question

Your analytics team stores 50 GB of transaction data in BigQuery and wants to train a logistic-regression churn model with BigQuery ML. One input column, customer_segment, contains string values such as "Retail", "SMB", and "Enterprise". The model must serve predictions through ML.PREDICT without any additional preprocessing code. Inside the TRANSFORM clause of the CREATE MODEL statement, which expression should you use to convert customer_segment into a set of binary indicator features that the model can consume at both training and prediction time?

  • ML.FEATURE_CROSS([customer_segment]) AS customer_segment_features

  • SAFE_CAST(customer_segment AS INT64) AS customer_segment_features

  • ML.BUCKETIZE(customer_segment, ['Retail','SMB','Enterprise']) AS customer_segment_features

  • ML.ONE_HOT_ENCODER(customer_segment) OVER() AS customer_segment_features
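
For context, the sketch below shows where a TRANSFORM clause sits in a CREATE MODEL statement and why it removes the need for separate preprocessing code at prediction time. It deliberately uses a different feature than the one the question asks about: a hypothetical numeric column, monthly_spend, scaled with ML.STANDARD_SCALER. The dataset, table, and column names (mydataset, transactions, new_customers, monthly_spend, churned) are assumptions made for illustration and do not come from the question.

```sql
-- Minimal sketch, assuming a table mydataset.transactions with a numeric
-- monthly_spend column and a churned label column (hypothetical names).
CREATE OR REPLACE MODEL `mydataset.churn_model_sketch`
TRANSFORM(
  -- Preprocessing declared here is stored with the model.
  ML.STANDARD_SCALER(monthly_spend) OVER() AS monthly_spend_scaled,
  -- The label column must be passed through so training can see it.
  churned
)
OPTIONS(
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['churned']
) AS
SELECT
  monthly_spend,
  churned
FROM `mydataset.transactions`;

-- At prediction time the caller supplies the raw column only;
-- the stored TRANSFORM is applied automatically by ML.PREDICT.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.churn_model_sketch`,
  (SELECT monthly_spend FROM `mydataset.new_customers`)
);
```

Because the statistics computed during training are saved with the model, the same transformation is replayed on the raw input passed to ML.PREDICT, which is the behavior the question's "without any additional preprocessing code" requirement relies on.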

Exam objective: Preparing and using data for analysis