A payment gateway must flag card-not-present fraud in real time. Legitimate transactions outnumber confirmed fraud roughly 1,000:1, and card networks can take up to 90 days to return charge-back labels. Every new purchase must be scored in less than 50ms so that checkout latency stays within the service-level agreement. Which approach to model training and production monitoring BEST satisfies these constraints while reducing both missed fraud and needless customer friction?
Deploy an online or mini-batch ensemble that applies class-weighted loss to give extra importance to fraud, retrains monthly on a rolling window of fully-labeled transactions, and triggers early retraining when the Population Stability Index exceeds a drift threshold.
Train an autoencoder solely on legitimate transactions, flag high reconstruction error as fraud, and ignore the delayed charge-back labels because the technique is unsupervised.
Randomly undersample 99.9% of legitimate transactions, train a gradient-boosted model once per year, and monitor overall accuracy and ROC-AUC for performance.
Use SMOTE to oversample fraud to a 1:1 ratio, fit a static logistic-regression model, and review precision-recall curves only once each quarter.
The online or mini-batch ensemble is the best fit because it does three things:
applies class-dependent weights during learning so the model does not ignore the rare but costly fraud cases,
updates itself on a rolling window of transactions once the delayed labels have matured, so new fraud patterns are learned quickly without leaking unverifiable data, and
watches a data-drift statistic such as the Population Stability Index so that it can trigger an earlier rebuild if the live feature distribution departs from the training distribution.
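As an illustration of the drift check, the Population Stability Index compares the binned distribution of a feature at training time with its live distribution. The sketch below is a minimal, self-contained version; the bin count, the synthetic "amount" feature, and the 0.2 alert threshold are illustrative assumptions, not values from the scenario.

```python
import numpy as np

def psi(expected: np.ndarray, actual: np.ndarray, bins: int = 10) -> float:
    """PSI = sum over bins of (live% - train%) * ln(live% / train%)."""
    # Bin edges are taken from the training-time (expected) distribution.
    edges = np.quantile(expected, np.linspace(0, 1, bins + 1))
    edges[0], edges[-1] = -np.inf, np.inf
    e_pct = np.histogram(expected, edges)[0] / len(expected)
    a_pct = np.histogram(actual, edges)[0] / len(actual)
    # Floor the percentages to avoid log(0) and division by zero.
    e_pct = np.clip(e_pct, 1e-6, None)
    a_pct = np.clip(a_pct, 1e-6, None)
    return float(np.sum((a_pct - e_pct) * np.log(a_pct / e_pct)))

rng = np.random.default_rng(0)
train_amounts = rng.normal(50, 10, 10_000)  # feature at training time
live_amounts = rng.normal(60, 10, 10_000)   # shifted live distribution
drift = psi(train_amounts, live_amounts)
if drift > 0.2:  # a common rule-of-thumb alert level
    print("PSI above threshold - trigger early retraining")
```

A PSI of 0 means the live feature matches training exactly; a one-standard-deviation mean shift, as simulated here, pushes PSI well past the usual 0.2 alert level.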
This combination directly tackles the three practical challenges in the scenario (extreme class imbalance, delayed ground truth, and concept drift) while keeping the scoring path lightweight enough for sub-50-millisecond latency.
Why the other options fall short:
Random undersampling plus yearly retraining discards most legitimate data, learns obsolete patterns, and evaluates with accuracy and ROC-AUC, metrics that hide performance on rare positives.
SMOTE with a static logistic regression ignores changing fraud behavior; quarterly reviews cannot correct months of drift, and SMOTE can introduce synthetic noise near the class boundary.
A reconstruction-error autoencoder trained only on legitimate traffic produces many false positives and, by ignoring label feedback, cannot improve once its anomaly threshold stops matching reality.