A data science team has developed two models to power a real-time product recommendation engine for a high-traffic e-commerce platform. The primary business need is to increase average order value (AOV) by at least 5% while keeping inference latency below 150ms to maintain a seamless user experience. During project discussions, key stakeholders also expressed a 'want': that the company be perceived as an industry innovator through its use of state-of-the-art AI.
The team has concluded testing on two final models:
Model A (Deep Learning): Achieves the highest offline precision and recall scores. However, its average inference latency is 300ms on the target production hardware, and it requires expensive GPU instances, which would significantly increase operational expenditure.
Model B (Matrix Factorization): Achieves offline precision and recall scores that are 4% lower than Model A's. Its average inference latency is 50ms on standard CPU instances, keeping operational costs low.
Both models are projected to increase AOV by more than the required 5%. Given these results, which action best demonstrates the ability to differentiate between business needs, wants, and the reality of the experimental results?
Recommend deploying Model A, arguing that its superior offline accuracy is more likely to drive AOV and aligns with the stakeholders' desire for cutting-edge technology.
Recommend deploying Model B, justifying the choice by explaining that it successfully meets all critical business needs, including the sub-150ms latency requirement, which is essential for user experience and achieving the AOV goal.
Recommend a live A/B test of both models to gather real-world performance data before making a final decision, letting the results determine which model is superior.
Request additional time and resources to attempt to reduce Model A's latency and operational cost before recommending a solution to the stakeholders.
The correct action is to recommend deploying Model B. This choice correctly prioritizes the critical business 'need' over the stakeholder 'want'. The scenario establishes a strict latency requirement of under 150ms, which is a core need for maintaining a positive user experience and, by extension, for achieving conversion-related goals such as increasing AOV. Model A, despite its higher offline accuracy, fails this critical requirement with a latency of 300ms. Model B meets both the AOV projection and the latency requirement. An expert data scientist understands that a model that harms the user experience through high latency is likely to fail in production, regardless of its offline accuracy.
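To make this reasoning concrete, here is a minimal Python sketch of the needs-first evaluation, with the scenario's requirements and results hard-coded. The precision and AOV-uplift figures are illustrative assumptions, chosen only to preserve the stated 4% accuracy gap and the "above 5%" projection.

    # Hypothetical sketch: hard business needs act as gating constraints;
    # offline accuracy only breaks ties among models that satisfy them.
    MAX_LATENCY_MS = 150   # hard need: real-time user experience
    MIN_AOV_UPLIFT = 0.05  # hard need: +5% average order value

    candidates = [
        {"name": "Model A", "latency_ms": 300, "aov_uplift": 0.06, "precision": 0.92},
        {"name": "Model B", "latency_ms": 50,  "aov_uplift": 0.06, "precision": 0.88},
    ]

    # Step 1: drop any model that violates a hard constraint (a business need).
    viable = [m for m in candidates
              if m["latency_ms"] < MAX_LATENCY_MS and m["aov_uplift"] >= MIN_AOV_UPLIFT]

    # Step 2: among viable models, prefer the best offline accuracy (a want).
    best = max(viable, key=lambda m: m["precision"])
    print(best["name"])  # -> Model B; Model A never survives step 1

Model A is eliminated before accuracy is ever compared, which is exactly the distinction between a gating need and a tie-breaking want.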
Recommending Model A prioritizes the stakeholder 'want' for advanced technology and a single metric (offline accuracy) while ignoring the reality that it violates a key business 'need' for low latency. Suggesting an A/B test is a valid step in general, but it is not the best recommendation here because Model A already fails a hard constraint (300ms against a 150ms limit), making it an unsuitable candidate for a live production test that could degrade the user experience. Requesting more resources to optimize Model A defers the decision and ignores the current, conclusive test results, which already show that a viable option (Model B) is available.
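As background on why Model B so comfortably meets the latency need: matrix-factorization inference reduces to a matrix-vector product over precomputed user and item factors, which commodity CPUs execute in milliseconds. The NumPy sketch below is purely illustrative; the catalog size, factor dimension, and random factors are assumptions, and in production the factors would be learned offline (e.g., via alternating least squares) and loaded at serving time.

    import numpy as np

    # Illustrative sizes: a 50,000-item catalog with 64 latent factors.
    n_items, k = 50_000, 64
    rng = np.random.default_rng(0)

    # Random stand-ins for factors that would normally be trained offline.
    item_factors = rng.standard_normal((n_items, k)).astype(np.float32)
    user_vector = rng.standard_normal(k).astype(np.float32)

    # Serving one request: a single matrix-vector product plus top-k selection.
    scores = item_factors @ user_vector
    top_10 = np.argpartition(scores, -10)[-10:]
    print(top_10)  # indices of the 10 highest-scoring items for this user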