A data analyst for an e-commerce company is monitoring the performance of a machine learning model that predicts customer purchasing behavior. For the past six months, the model's accuracy has been stable. However, in the last quarter, its predictive performance has degraded significantly. The analyst investigates and finds that the statistical distribution of incoming data for key features like 'customer age' and 'average session duration' has shifted, no longer matching the data used to train the model. Which of the following data quality assurance concepts BEST describes this issue?
The correct answer is Data drift. Data drift refers to the phenomenon where the statistical properties of the input data for a model change over time, causing the production data to differ from the data the model was trained on. This shift can lead to a degradation in model performance, as seen in the scenario where changes in customer age and session duration distributions are impacting the model's accuracy.
Requirement testing is incorrect because it is the process of validating that a system or its data outputs meet the initial business requirements. This type of testing is typically performed before deployment, not to monitor the performance of a live model in response to changing data patterns.
A unit test is a test that verifies the functionality of a specific, isolated piece of code. A unit test failure would indicate a bug in the code itself, not a change in the statistical properties of the live data flowing through the system. The scenario describes a problem with the data's characteristics, not a software bug.
Source control, also known as version control, is a system for tracking and managing changes to software code, scripts, and other files. It is not a process for monitoring the statistical properties of production data.
Ask Bash
Bash is our AI bot, trained to help you pass your exam. AI Generated Content may display inaccurate information, always double-check anything important.
What is data drift in machine learning?
Open an interactive chat with Bash
How can data drift be detected?
Open an interactive chat with Bash
What are the best practices for mitigating data drift?
Open an interactive chat with Bash
What causes data drift and how can it impact machine learning models?
Open an interactive chat with Bash
How is data drift different from concept drift?
Open an interactive chat with Bash
What techniques can be used to detect and address data drift?
Open an interactive chat with Bash
CompTIA Data+ DA0-002 (V2)
Data Governance
Your Score:
Report Issue
Bash, the Crucial Exams Chat Bot
AI Bot
Loading...
Loading...
Loading...
Pass with Confidence.
IT & Cybersecurity Package
You have hit the limits of our free tier, become a Premium Member today for unlimited access.
Military, Healthcare worker, Gov. employee or Teacher? See if you qualify for a Community Discount.
Monthly
$19.99 $11.99
$11.99/mo
Billed monthly, Cancel any time.
$19.99 after promotion ends
3 Month Pass
$44.99 $26.99
$8.99/mo
One time purchase of $26.99, Does not auto-renew.
$44.99 after promotion ends
Save $18!
MOST POPULAR
Annual Pass
$119.99 $71.99
$5.99/mo
One time purchase of $71.99, Does not auto-renew.
$119.99 after promotion ends
Save $48!
BEST DEAL
Lifetime Pass
$189.99 $113.99
One time purchase, Good for life.
Save $76!
What You Get
All IT & Cybersecurity Package plans include the following perks and exams .