An organization is building a predictive model. During data collection, hidden entries are covertly introduced that degrade the model's accuracy. Which measure best addresses this situation?
Disconnect the dataset from the network to reduce unauthorized access after collection.
Perform data verification and anomaly detection prior to model training to detect manipulated inputs.
Exclude identified records that are outdated or incomplete from model training.
Assign model validation to a team member with relevant expertise.
This scenario describes data poisoning: an attacker slips manipulated records into the training set. Data verification and anomaly detection uncover such subtle manipulations before they can affect model accuracy, by scanning for patterns, inconsistencies, or unexpected values that may indicate tampering. Assigning validation to one person, removing outdated or incomplete entries, or isolating the dataset after collection does not address records that have already been poisoned. Proactively checking for anomalies strengthens the model's integrity and ensures cleaner, more trustworthy inputs for training.
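As a minimal sketch of the idea, the check below flags numeric records whose modified z-score (based on the median and median absolute deviation, which are robust to the very outliers being hunted) exceeds a conventional threshold of 3.5. The function name, sample data, and threshold are illustrative assumptions, not part of the exam material; production pipelines would typically use richer methods (e.g., isolation forests or per-feature validation rules).

```python
import statistics

def flag_anomalies(values, threshold=3.5):
    """Return indexes of entries whose modified z-score exceeds the
    threshold -- a simple pre-training screen for manipulated inputs.
    Uses median/MAD so a single injected extreme value cannot mask itself."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:  # all values (nearly) identical; nothing to flag
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

# Mostly consistent feature values with one implausible injected entry
readings = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 95.0, 10.1]
print(flag_anomalies(readings))  # → [6], the index of the suspicious record
```

Records flagged this way would be reviewed (or excluded) before model training, which is exactly the proactive verification step the correct answer describes.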