A machine learning engineer is developing a regression model with a gradient boosting algorithm for a task known to contain non-linear patterns. To mitigate potential overfitting, the engineer has configured the model with a high L2 regularization (lambda) value and constrained the trees' max_depth to 2. The performance metrics are as follows:
Training Set Root Mean Squared Error (RMSE): 148.5
Validation Set Root Mean Squared Error (RMSE): 150.1
A simple baseline model that predicts the mean of the target variable has an RMSE of 155.0 on the same validation set. The engineer observes that the model's performance on both the training and validation sets is poor and only marginally better than the baseline. This indicates that the model is not capturing the underlying structure of the data. Which of the following is the most effective strategy to address this specific issue?
Implement k-fold cross-validation and increase the L2 regularization parameter.
Increase the max_depth of the trees and decrease the L2 regularization parameter.
Apply the Synthetic Minority Oversampling Technique (SMOTE) to the training data.
Add more features to the dataset while keeping the current max_depth and regularization.
The correct answer is to increase the max_depth of the trees and decrease the L2 regularization parameter. The scenario describes a classic case of underfitting, which is characterized by high bias: the model performs poorly on both the training and validation sets, and the two errors are nearly identical. This suggests the model is too simple to capture the complex, non-linear patterns in the data. The high regularization value and very shallow max_depth are excessively constraining the model's capacity. To address underfitting, the most direct approach is to increase that capacity: raising max_depth lets the trees learn more complex feature interactions, and lowering the L2 regularization reduces the penalty on model complexity. Both changes reduce bias and improve the model's ability to fit the data.
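As a minimal illustrative sketch (not the engineer's actual pipeline), the contrast between the over-constrained configuration and a higher-capacity one might look as follows; the synthetic make_friedman1 dataset, the XGBRegressor implementation, and all parameter values are assumptions chosen only for illustration:

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from xgboost import XGBRegressor

# Assumed synthetic non-linear regression task standing in for the scenario's data
X, y = make_friedman1(n_samples=2000, noise=1.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

configs = {
    "over-constrained (underfits)": dict(max_depth=2, reg_lambda=100.0),
    "higher capacity": dict(max_depth=6, reg_lambda=1.0),
}

for name, params in configs.items():
    model = XGBRegressor(n_estimators=300, learning_rate=0.1, **params)
    model.fit(X_train, y_train)
    train_rmse = np.sqrt(mean_squared_error(y_train, model.predict(X_train)))
    val_rmse = np.sqrt(mean_squared_error(y_val, model.predict(X_val)))
    print(f"{name}: train RMSE = {train_rmse:.2f}, validation RMSE = {val_rmse:.2f}")
```

With deeper trees and a lighter penalty, both the training and validation errors typically drop well below those of the over-constrained configuration, which is the signature of reduced bias rather than increased variance.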
Implementing k-fold cross-validation is a good practice for robust evaluation but does not directly address the underfitting problem. Further increasing the regularization parameter would worsen the underfitting.
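For context, a brief sketch (under the same assumptions as above) of why cross-validation alone does not help: cross_val_score only produces a more reliable estimate of the error, while every fold is still fit with the same over-constrained, high-bias model.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.model_selection import cross_val_score
from xgboost import XGBRegressor

X, y = make_friedman1(n_samples=2000, noise=1.0, random_state=0)
over_constrained = XGBRegressor(n_estimators=300, max_depth=2, reg_lambda=100.0)

# 5-fold CV yields a more robust error estimate, but the model's capacity is unchanged
scores = -cross_val_score(over_constrained, X, y, cv=5,
                          scoring="neg_root_mean_squared_error")
print(f"cross-validated RMSE: {scores.mean():.2f} +/- {scores.std():.2f}")
```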
Applying SMOTE (Synthetic Minority Oversampling Technique) is incorrect because SMOTE is designed to handle class imbalance in classification problems, not for regression tasks.
Adding more features could potentially help, but the primary issue identified by the metrics is the overly simplistic model configuration (high regularization and low depth), which should be addressed first.
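Finally, a hedged sketch of how a mean-predicting baseline and the underfitting diagnosis can be reproduced in code; DummyRegressor, the synthetic make_friedman1 data, and the split are assumptions for illustration, not the scenario's actual dataset.

```python
import numpy as np
from sklearn.datasets import make_friedman1
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

# Assumed synthetic non-linear data standing in for the scenario's dataset
X, y = make_friedman1(n_samples=2000, noise=1.0, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Baseline that always predicts the mean of the training targets
baseline = DummyRegressor(strategy="mean").fit(X_train, y_train)
baseline_rmse = np.sqrt(mean_squared_error(y_val, baseline.predict(X_val)))
print(f"mean-predictor baseline validation RMSE = {baseline_rmse:.2f}")

# Diagnosis rule of thumb:
#   train RMSE ~ validation RMSE ~ baseline RMSE  -> underfitting (high bias)
#   train RMSE << validation RMSE                 -> overfitting (high variance)
```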