CompTIA DataX DY0-001 (V1) Practice Question

A data science team is developing a predictive model for equipment failure using a single, unpruned decision tree. During testing, they observe two phenomena:

The model achieves near-perfect accuracy on the training dataset but performs poorly on the unseen validation dataset.
Minor changes to the training data, such as removing a small number of data points, result in a drastically different tree structure and predictions.

Which underlying characteristic of decision trees is the primary cause of both of these observations?

The curse of dimensionality
Multicollinearity
High variance
High bias

Answer Description

The correct answer is high variance. A high-variance model is highly sensitive to the particular sample it was trained on. In an unpruned decision tree, this sensitivity has two effects. First, the tree keeps splitting until it fits the training data, noise included, almost exactly. This is overfitting, and it explains why the model has near-perfect accuracy on the training set but generalizes poorly to new, unseen data. Second, because each split is chosen greedily from the specific training sample, removing even a few data points can change an early split and, with it, every branch below that split. This behavior is known as instability.
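Both effects can be reproduced on a toy problem. The following is a minimal sketch, not a reference implementation: it assumes a single numeric feature, binary labels with 20% label noise, and a hand-written greedy Gini tree. All names (`build_tree`, `make_data`, `disagreement`, and so on) are hypothetical and chosen for illustration only.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def build_tree(points, max_depth=None, depth=0):
    """Grow a tree on (x, label) pairs; max_depth=None means unpruned."""
    labels = [y for _, y in points]
    majority = int(2 * sum(labels) >= len(labels))
    if len(set(labels)) == 1 or (max_depth is not None and depth >= max_depth):
        return majority  # leaf: predict the majority class
    xs = sorted(set(x for x, _ in points))
    best = None
    for threshold in xs[:-1]:  # candidate rule: x <= threshold goes left
        left = [y for x, y in points if x <= threshold]
        right = [y for x, y in points if x > threshold]
        score = len(left) * gini(left) + len(right) * gini(right)
        if best is None or score < best[0]:
            best = (score, threshold)
    if best is None:  # every x is identical, so no split is possible
        return majority
    t = best[1]
    return (t,
            build_tree([p for p in points if p[0] <= t], max_depth, depth + 1),
            build_tree([p for p in points if p[0] > t], max_depth, depth + 1))

def predict(tree, x):
    """Walk from the root to a leaf and return its class."""
    while isinstance(tree, tuple):
        t, left, right = tree
        tree = left if x <= t else right
    return tree

def accuracy(tree, points):
    return sum(predict(tree, x) == y for x, y in points) / len(points)

def make_data(n, rng, noise=0.2):
    """Assumed toy data: true class is x > 0.5, flipped with probability `noise`."""
    data = []
    for _ in range(n):
        x = rng.random()
        y = int(x > 0.5)
        if rng.random() < noise:
            y = 1 - y
        data.append((x, y))
    return data

rng = random.Random(0)
train = make_data(200, rng)
valid = make_data(200, rng)

deep = build_tree(train)                 # unpruned tree
stump = build_tree(train, max_depth=1)   # heavily restricted tree, for contrast
deep_minus = build_tree(train[5:])       # same data with 5 points removed

train_acc_deep = accuracy(deep, train)
valid_acc_deep = accuracy(deep, valid)
train_acc_stump = accuracy(stump, train)
valid_acc_stump = accuracy(stump, valid)
# Fraction of validation points on which the two unpruned trees disagree.
disagreement = sum(predict(deep, x) != predict(deep_minus, x)
                   for x, _ in valid) / len(valid)

print("unpruned train/valid accuracy:", train_acc_deep, valid_acc_deep)
print("depth-1 train/valid accuracy:", train_acc_stump, valid_acc_stump)
print("disagreement after removing 5 points:", disagreement)
```

In this sketch the unpruned tree memorizes the noisy training labels, so its training accuracy is perfect while its validation accuracy is not. The `disagreement` value measures how much the predictions shift when only five training points are dropped. The depth-1 tree is included only as a contrast: it cannot fit the noise, so its training accuracy stays below 100%.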

High bias is incorrect. High bias refers to underfitting, where the model is too simple to capture the underlying patterns in the data. This would result in poor performance on both the training and validation sets, which contradicts the scenario.
The curse of dimensionality is incorrect. It refers to problems that arise when working with high-dimensional data, such as data sparsity and increased computational cost. While it can degrade model performance, the scenario says nothing about a large number of features, and it does not by itself explain why the tree changes drastically when a few data points are removed.
Multicollinearity is incorrect. High correlation among predictor variables can affect which features a decision tree selects for its splits, and therefore its interpretability, but it is not the fundamental reason for overfitting and sensitivity to small changes in the data. The core issue described is high variance, a well-known weakness of decision trees.
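The distinction between high bias and high variance drawn above can be stated with the standard bias-variance decomposition of expected squared error. The notation here is an assumption introduced for illustration: the target is $y = f(x) + \varepsilon$ with zero-mean noise of variance $\sigma^2$, and $\hat{f}_D$ is the model fitted on a random training set $D$.

```latex
\mathbb{E}\big[(y - \hat{f}_D(x))^2\big]
  = \underbrace{\big(\mathbb{E}_D[\hat{f}_D(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}_D\big[(\hat{f}_D(x) - \mathbb{E}_D[\hat{f}_D(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

The variance term measures how much the fitted model changes from one training set to another, which is the quantity the second observation in the question describes. An unpruned tree typically has low bias but a large variance term, while an underfit model has the opposite profile.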
