A data science team trains an XGBoost model to predict loan default. A feature-importance plot based on the gain metric ranks the variable Customer_ID highest, while Age appears near the bottom. When the team computes permutation importance on a held-out validation set, Age rises to the top and Customer_ID drops sharply. Which explanation best accounts for the conflicting importance rankings?
The conflict arises because permutation importance for classification relies on the Gini impurity formula used in regression trees, which is incompatible with XGBoost models.
Permutation importance is calculated only on the training data, so it undervalues features that generalize well and makes Customer_ID look weaker than it really is.
Gain importance ignores how frequently a feature is selected for splitting, so variables like Age that create large gains only a few times are hidden from the ranking.
Gain importance tends to inflate the score of features that have many unique values or potential split points, such as an identifier; permutation importance measures the drop in validation performance and is therefore much less affected by this cardinality bias.
Gain-based importance is computed from how much each split on a feature reduces the training loss. Features that offer many candidate split points (high-cardinality IDs, long numeric scales, etc.) give the trees more opportunities to find a split that fits the training data, so the gain metric tends to inflate their contribution even when they add little out-of-sample predictive power. Permutation importance, by contrast, measures the drop in model performance when a feature's values are shuffled on a separate validation set, so it reflects how much the model's held-out predictions actually depend on that feature and is far less sensitive to cardinality. Permutation importance therefore downgrades Customer_ID (an almost pure identifier) and upgrades Age, while the gain metric shows the opposite bias. The other options misstate how the two methods are computed or confuse regression and classification impurity measures.
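The contrast can be reproduced with a small, dependency-free sketch. This is not XGBoost: it grows a single decision tree to purity, and uses sample-weighted Gini impurity reduction as a stand-in for XGBoost's loss-based gain. The data, feature names, and helper functions are all hypothetical and chosen only for illustration; the held-out customers are assumed to have IDs never seen in training.

```python
import random

FEATURES = ["customer_id", "age"]

# Hypothetical toy rows: (customer_id, age, defaulted). The ID carries no real signal.
TRAIN = [(1, 25, 1), (2, 55, 0), (3, 35, 1), (4, 45, 1),
         (5, 25, 0), (6, 45, 0), (7, 35, 1), (8, 55, 0)]
# Held-out customers have unseen IDs, as new customers would.
VALID = [(101, 25, 1), (102, 55, 0), (103, 35, 1), (104, 45, 0),
         (105, 55, 0), (106, 25, 1), (107, 35, 1), (108, 55, 0)]


def gini(rows):
    """Gini impurity of the binary labels (last element of each row)."""
    if not rows:
        return 0.0
    p = sum(r[-1] for r in rows) / len(rows)
    return 2 * p * (1 - p)


def build_tree(rows, gain):
    """Grow a tree to purity, crediting each split's impurity reduction to its feature."""
    labels = [r[-1] for r in rows]
    majority = int(2 * sum(labels) >= len(labels))
    parent = gini(rows)
    if parent == 0.0:
        return majority  # pure node becomes a leaf
    best = None
    for f in range(len(FEATURES)):
        # Every distinct value except the largest is a candidate threshold,
        # so a unique ID offers far more candidates than a banded age.
        for t in sorted({r[f] for r in rows})[:-1]:
            left = [r for r in rows if r[f] <= t]
            right = [r for r in rows if r[f] > t]
            child = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if best is None or parent - child > best[0]:
                best = (parent - child, f, t, left, right)
    if best is None:
        return majority  # no feature can separate the rows
    reduction, f, t, left, right = best
    gain[FEATURES[f]] += len(rows) * reduction  # sample-weighted total gain
    return (f, t, build_tree(left, gain), build_tree(right, gain))


def predict(tree, row):
    """Walk from the root to a leaf; leaves are plain ints (0 or 1)."""
    while not isinstance(tree, int):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def accuracy(tree, rows):
    return sum(predict(tree, r) == r[-1] for r in rows) / len(rows)


def permutation_importance(tree, rows, f, repeats=30, seed=0):
    """Mean drop in accuracy after shuffling column f of the held-out rows."""
    rng = random.Random(seed)
    base = accuracy(tree, rows)
    drops = []
    for _ in range(repeats):
        column = [r[f] for r in rows]
        rng.shuffle(column)
        shuffled = [r[:f] + (v,) + r[f + 1:] for r, v in zip(rows, column)]
        drops.append(base - accuracy(tree, shuffled))
    return sum(drops) / repeats


gain = {name: 0.0 for name in FEATURES}
tree = build_tree(TRAIN, gain)
perm = {name: permutation_importance(tree, VALID, f)
        for f, name in enumerate(FEATURES)}
print("gain importance:       ", gain)
print("permutation importance:", perm)
```

In this toy setup the tree credits customer_id with about twice the total gain of age, because ID splits are what finally isolate the noisy training rows. On the held-out rows every unseen ID falls on the same side of each ID split, so shuffling customer_id changes no prediction and its permutation importance is zero, while shuffling age lowers accuracy.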