A data science team has developed an initial fraud detection model for a financial services company using a complex Gradient Boosting Machine (GBM) on a large, high-dimensional dataset. While the model achieves a high F1-score of 0.92 on the test set, its average inference time is 200ms, which fails to meet the critical business requirement of sub-50ms latency for real-time transaction processing. The team is also constrained by a fixed cloud computing budget for model deployment. Given these constraints, which of the following actions represents the most effective next step in the model design iteration process to address the latency issue while managing performance trade-offs?
Initiate a new set of experiments focused on hyperparameter tuning, specifically reducing the max_depth and n_estimators of the GBM, while systematically tracking the impact on both F1-score and inference time.
Perform aggressive feature selection to reduce the number of input features by 75% and then retrain the original GBM model without altering its hyperparameters.
Replace the GBM with a simpler, inherently faster model like Logistic Regression and retrain it on the full dataset, as this is the only way to guarantee meeting the latency requirement.
Deploy the model on a more powerful and expensive GPU-based inference server to decrease the latency, requesting an increase in the cloud computing budget.
The correct action is to initiate a new set of experiments focused on hyperparameter tuning. The core issue is that the model's complexity drives its high inference latency, which conflicts directly with the business requirement. Model design iteration means systematically refining a model to meet such constraints. Hyperparameters like max_depth and n_estimators control a GBM's complexity: fewer, shallower trees mean less work per prediction, so the model executes faster, though the F1-score may drop slightly. This approach addresses the accuracy-versus-speed trade-off in a controlled, iterative manner, which is a core part of model design.
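A minimal sketch of such a tuning experiment is shown below. It assumes scikit-learn's GradientBoostingClassifier and a synthetic, imbalanced dataset standing in for the real transaction data; the grid values and dataset sizes are illustrative, and a real project would use the production feature set and an experiment tracker rather than print statements.

```python
# Hypothetical sketch: sweep max_depth and n_estimators, recording F1-score
# and average per-row inference latency for each configuration.
import time
from itertools import product

from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the fraud dataset (~3% positive class).
X, y = make_classification(n_samples=20_000, n_features=50,
                           weights=[0.97], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0)

for max_depth, n_estimators in product([3, 5, 8], [100, 300, 600]):
    model = GradientBoostingClassifier(
        max_depth=max_depth, n_estimators=n_estimators, random_state=0)
    model.fit(X_train, y_train)

    # Batch predict time divided by rows: a rough proxy for
    # single-transaction inference latency.
    start = time.perf_counter()
    preds = model.predict(X_test)
    latency_ms = (time.perf_counter() - start) / len(X_test) * 1_000

    print(f"depth={max_depth:<2} trees={n_estimators:<4} "
          f"F1={f1_score(y_test, preds):.3f} "
          f"latency={latency_ms:.4f} ms/row")
```

Tracking both metrics for every configuration makes the accuracy-versus-latency trade-off explicit, so the team can pick the smallest model that still clears the sub-50ms requirement with an acceptable F1-score.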
Incorrect options explained:
Replacing the GBM with a much simpler model like Logistic Regression is a drastic step that is likely to cause an unacceptable drop in the F1-score. It is not the best next step, as tuning the current, high-performing architecture should be attempted first.
Aggressively reducing features by a fixed, arbitrary amount (75%) without also tuning the model's parameters is a high-risk and less systematic approach. While feature selection is a valid technique, this specific action is not as controlled as hyperparameter tuning.
Requesting a budget increase to deploy on more powerful hardware violates the stated constraint of a "fixed cloud computing budget" and is often not the first or most cost-effective solution in the iteration process.