Your analytics team stores 50 GB of transaction data in BigQuery and wants to train a logistic-regression churn model with BigQuery ML. One input column, customer_segment, contains string values such as "Retail", "SMB", and "Enterprise". The model must serve predictions through ML.PREDICT without any additional preprocessing code. Inside the TRANSFORM clause of the CREATE MODEL statement, which expression should you use to convert customer_segment into a set of binary indicator features that the model can consume at both training and prediction time?
ML.FEATURE_CROSS([customer_segment]) AS customer_segment_features
SAFE_CAST(customer_segment AS INT64) AS customer_segment_features
ML.BUCKETIZE(customer_segment, ['Retail','SMB','Enterprise']) AS customer_segment_features
ML.ONE_HOT_ENCODER(customer_segment) AS customer_segment_features
ML.ONE_HOT_ENCODER is BigQuery ML's built-in one-hot encoding function for categorical columns such as customer_segment. Applied inside the TRANSFORM clause, ML.ONE_HOT_ENCODER(customer_segment) learns the set of observed categories during training and encodes each row as a sparse indicator vector (an array of index/value pairs, with one index per category) rather than as separate Boolean columns. In full syntax it is an analytic function, so the call is followed by an empty OVER() clause. Because the transformation is stored with the model, ML.PREDICT applies the same encoding to incoming data at prediction time, eliminating the need for external preprocessing and preventing training-serving skew.
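ML.BUCKETIZE splits a numeric expression into buckets at the split points you supply; it does not encode string categories. SAFE_CAST(customer_segment AS INT64) returns NULL for values such as "Retail" because they are not numeric strings, so it produces no usable feature. ML.FEATURE_CROSS builds interaction (crossed) features from two or more categorical columns and is not intended for one-hot encoding a single column. Therefore, ML.ONE_HOT_ENCODER is the only option that meets the requirement.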
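To make the explanation concrete, here is a minimal sketch of how the chosen expression might sit inside a CREATE MODEL statement. The dataset, table, and model names (`mydataset.churn_model`, `mydataset.transactions`, `mydataset.new_customers`) and the columns `total_spend` and `churned` are hypothetical placeholders; only customer_segment comes from the question. The OVER() clause follows the documented analytic-function syntax for ML.ONE_HOT_ENCODER.

```sql
-- Sketch only: dataset, table, and column names other than
-- customer_segment are assumptions made for illustration.
CREATE OR REPLACE MODEL `mydataset.churn_model`
TRANSFORM(
  -- The category vocabulary is learned during training and stored with the model.
  ML.ONE_HOT_ENCODER(customer_segment) OVER() AS customer_segment_features,
  total_spend,  -- hypothetical numeric feature, passed through unchanged
  churned       -- hypothetical label column, passed through unchanged
)
OPTIONS(
  model_type = 'LOGISTIC_REG',
  input_label_cols = ['churned']
) AS
SELECT customer_segment, total_spend, churned
FROM `mydataset.transactions`;

-- At prediction time, pass the raw string column; the stored TRANSFORM
-- is applied before the model scores each row.
SELECT *
FROM ML.PREDICT(
  MODEL `mydataset.churn_model`,
  (SELECT customer_segment, total_spend FROM `mydataset.new_customers`)
);
```

Because the encoding lives in the TRANSFORM clause, the prediction query supplies the same raw columns as the training query (minus the label) and needs no separate preprocessing step.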
GCP Professional Data Engineer
Preparing and using data for analysis