You are building a churn-propensity model in BigQuery ML. The training table contains a numeric column named total_spend that ranges from a few cents to several thousand US dollars, and its distribution is extremely skewed. Business analysts want the model to treat spend as four ordered categories ("< 25", "25-100", "100-500", and ">= 500") so that coefficients are learned per range and the same transformation is applied when the model is used for prediction. You plan to express this logic in the TRANSFORM clause of the CREATE MODEL statement. Which BigQuery ML manual preprocessing function should you use to implement the required transformation?
Apply ML.ROBUST_SCALER() to normalize total_spend using its interquartile range.
Apply ML.BUCKETIZE() with the split points in the TRANSFORM clause.
Apply ML.MAX_ABS_SCALER() to rescale total_spend between -1 and 1 before training.
Apply ML.FEATURE_CROSS() to create four spend category indicators from total_spend.
The requirement is to convert a continuous numeric feature into a small set of discrete ranges defined by explicit numeric boundaries, and to have BigQuery ML remember and re-apply that mapping at prediction time. ML.BUCKETIZE is designed for this purpose: it takes a numeric input and a list of user-supplied split points and returns the name of the bucket each value falls into. Because the call sits in the TRANSFORM clause, BigQuery ML saves the transformation with the model and applies it automatically during prediction and evaluation. The scaling functions ML.MAX_ABS_SCALER and ML.ROBUST_SCALER leave the feature continuous: the first divides by the maximum absolute value so that results fall between -1 and 1, and the second rescales using the interquartile range. ML.FEATURE_CROSS combines existing categorical features into crossed features; it does not create buckets from a numeric column. Therefore, ML.BUCKETIZE with the split points 25, 100, and 500 is the correct choice.
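To make the bucket assignment concrete, here is a minimal Python sketch of the rule that a call such as ML.BUCKETIZE(total_spend, [25, 100, 500]) is expected to apply. It is an illustration, not BigQuery ML code. It assumes that a value equal to a split point falls into the higher bucket and that buckets are named bin_1 through bin_4. The helper names bucketize and spend_range are hypothetical.

```python
from bisect import bisect_right

# Split points implied by the four ranges in the question.
SPLIT_POINTS = [25, 100, 500]
# Human-readable range for each bucket, in order.
LABELS = ["< 25", "25-100", "100-500", ">= 500"]


def bucketize(value, split_points=SPLIT_POINTS):
    """Return a 'bin_<n>' name, numbering buckets from 1.

    Assumption: a value equal to a split point goes into the higher
    bucket, so 25 lands in bin_2 and 500 lands in bin_4.
    """
    return f"bin_{bisect_right(split_points, value) + 1}"


def spend_range(value):
    """Return the analysts' range label for a spend value (hypothetical helper)."""
    return LABELS[bisect_right(SPLIT_POINTS, value)]


if __name__ == "__main__":
    for spend in (0.05, 25, 99.99, 500, 4200):
        print(spend, bucketize(spend), spend_range(spend))
```

With three split points there are four buckets, one per range the analysts asked for. Each bucket is a category rather than a number, which is why a linear model would learn a separate coefficient for each range.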