A financial services company has been successfully using a machine learning model for two years to predict loan default risk. The model was trained on historical data and uses features like credit score, income level, debt-to-income ratio, and loan purpose. Recently, the model's performance has degraded significantly. An investigation reveals that while the distributions of the input features for new loan applicants remain statistically consistent with the training data, the economic climate has shifted. A new government policy has been implemented, offering interest rate relief to certain borrowers, which has changed the underlying factors that lead to loan defaults. Which of the following best describes the primary cause for the model's declining accuracy?
The correct answer is concept drift. Concept drift occurs when the relationship between the input features and the target variable changes over time; that is, the conditional distribution of the target given the inputs, P(y | X), shifts even if the input distribution P(X) stays the same. In this scenario, the new government policy altered which applicants actually go on to default, changing the fundamental relationship (the "concept") the model originally learned, even though the input data distributions (applicant profiles) have not changed.
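To make the idea concrete, here is a minimal simulation sketch, not taken from the scenario's actual model. It assumes a single hypothetical feature (a debt-to-income ratio drawn uniformly from 0 to 1) and a toy threshold "model"; the thresholds 0.5 and 0.7 and the 5% label noise are illustrative assumptions. The applicants come from the same distribution before and after the policy, but the rule that decides who defaults changes, so the frozen model loses accuracy.

```python
import random

def make_applicants(n, rng):
    # Hypothetical feature: debt-to-income ratio, uniform on [0, 1).
    return [rng.random() for _ in range(n)]

def label_default(dti, threshold, rng, noise=0.05):
    # Assumed ground truth: default when dti exceeds the threshold,
    # with a small chance of the label being flipped.
    defaulted = dti > threshold
    if rng.random() < noise:
        defaulted = not defaulted
    return defaulted

def accuracy(model_threshold, xs, ys):
    # Fraction of applicants where the threshold model matches the label.
    return sum((x > model_threshold) == y for x, y in zip(xs, ys)) / len(xs)

def fit_threshold(xs, ys):
    # Toy "training": pick the grid threshold with the best training accuracy.
    candidates = [i / 100 for i in range(101)]
    return max(candidates, key=lambda t: accuracy(t, xs, ys))

rng = random.Random(0)

# Training period: default risk starts at dti > 0.5.
train_x = make_applicants(5000, rng)
train_y = [label_default(x, 0.5, rng) for x in train_x]
model_t = fit_threshold(train_x, train_y)

# Before the policy: same P(X), same P(y | X).
before_x = make_applicants(5000, rng)
before_y = [label_default(x, 0.5, rng) for x in before_x]

# After the policy: same P(X), but relief means default now starts at dti > 0.7.
after_x = make_applicants(5000, rng)
after_y = [label_default(x, 0.7, rng) for x in after_x]

acc_before = accuracy(model_t, before_x, before_y)
acc_after = accuracy(model_t, after_x, after_y)
print(f"learned threshold: {model_t:.2f}")
print(f"accuracy before policy: {acc_before:.3f}")
print(f"accuracy after policy:  {acc_after:.3f}")
```

The drop in accuracy comes entirely from the changed labelling rule; nothing about the simulated applicants themselves is different.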
Data drift, also called covariate shift, is incorrect because it is defined by a change in the statistical distribution of the input features, which the scenario explicitly states has remained consistent with the training data.
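One way an investigation might check for data drift is to compare a feature's distribution in training against its distribution in recent applications. The sketch below is an illustration under stated assumptions, not the company's actual procedure: it computes the two-sample Kolmogorov-Smirnov statistic (the largest gap between two empirical CDFs) in plain Python, on hypothetical credit scores generated with made-up means and spreads.

```python
import random
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    # Two-sample Kolmogorov-Smirnov statistic: the largest absolute gap
    # between the empirical CDFs, evaluated at every observed value.
    a_sorted, b_sorted = sorted(sample_a), sorted(sample_b)
    gap = 0.0
    for value in a_sorted + b_sorted:
        cdf_a = bisect_right(a_sorted, value) / len(a_sorted)
        cdf_b = bisect_right(b_sorted, value) / len(b_sorted)
        gap = max(gap, abs(cdf_a - cdf_b))
    return gap

rng = random.Random(1)

# Hypothetical credit scores; the mean and spread are illustrative only.
train_scores = [rng.gauss(680, 50) for _ in range(2000)]
same_scores = [rng.gauss(680, 50) for _ in range(2000)]     # no data drift
shifted_scores = [rng.gauss(640, 50) for _ in range(2000)]  # data drift

stable_gap = ks_statistic(train_scores, same_scores)
shifted_gap = ks_statistic(train_scores, shifted_scores)
print(f"KS statistic, consistent inputs: {stable_gap:.3f}")
print(f"KS statistic, shifted inputs:    {shifted_gap:.3f}")
```

A small statistic for every feature, as in the scenario, suggests the inputs are stable, which points away from data drift as the explanation for the degradation.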
Overfitting is incorrect because it is a modeling error in which the model fits the training data too closely, including its noise, and therefore generalizes poorly to new data from the moment it is deployed. The model's successful performance for two years makes overfitting an unlikely primary cause of the recent degradation.
Data leakage is incorrect. Data leakage happens during model development when the training data contains information that would not be available at prediction time, often leading to artificially inflated performance metrics. This does not describe a model that performs well for a period and then degrades due to external changes.