A data science team has developed a multiclass classifier to categorize customer support inquiries into five distinct types: 'Billing', 'Technical Issue', 'Account Access', 'Product Feedback', and 'General Question'. After initial training, the model achieves 92% overall accuracy. However, a closer look at the confusion matrix reveals that the model performs very poorly on the 'Product Feedback' category, which constitutes only 3% of the dataset. The business considers this category to be of high value. Which of the following is the most effective initial step to address the model's poor performance on the minority class?
Apply a resampling technique such as the Synthetic Minority Oversampling Technique (SMOTE) to the training data.
Re-architect the model from a native multiclass classifier to a one-vs-rest (OvR) strategy.
Change the primary evaluation metric from accuracy to a macro-averaged F1-score and re-evaluate.
Deploy the model as-is but schedule frequent retraining as more 'Product Feedback' examples are collected.
The correct answer is to apply a resampling technique such as the Synthetic Minority Oversampling Technique (SMOTE). The scenario describes a classic case of class imbalance, where a model performs well on majority classes but poorly on minority classes despite high overall accuracy. SMOTE is a data-level approach that addresses this by creating new, synthetic minority-class examples, interpolating between each minority sample and its nearest minority-class neighbors, to balance the training set. This allows the model to learn the patterns of the minority class more effectively, directly addressing the root cause of the poor performance.
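To make this concrete, here is a minimal sketch using the imbalanced-learn library. The dataset is synthetic and the class weights are illustrative stand-ins for the scenario (the 3% class mimics 'Product Feedback'); treat the specific numbers as assumptions, not part of the question.

```python
from collections import Counter

from imblearn.over_sampling import SMOTE
from sklearn.datasets import make_classification

# Synthetic stand-in for the support-inquiry dataset: five classes,
# with the last class at ~3% to mimic 'Product Feedback'.
X, y = make_classification(
    n_samples=5000,
    n_classes=5,
    n_informative=10,
    weights=[0.30, 0.27, 0.25, 0.15, 0.03],
    random_state=42,
)
print("Before SMOTE:", Counter(y))

# SMOTE creates synthetic minority samples by interpolating between each
# minority point and its k nearest minority-class neighbors (k=5 by default).
# Apply it to the training split only; evaluation data must keep the
# original class distribution.
X_resampled, y_resampled = SMOTE(random_state=42).fit_resample(X, y)
print("After SMOTE: ", Counter(y_resampled))
```

In a real pipeline, resample inside cross-validation (imbalanced-learn's Pipeline applies samplers only during fitting) so that synthetic points never leak into validation folds.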
Adopting a one-vs-rest (OvR) strategy is incorrect because, while it is a valid way to handle multiclass classification, it does not solve the underlying class imbalance. In an OvR scheme, each binary classifier still faces a highly imbalanced dataset (one class versus all others); the 'Product Feedback' classifier would see roughly 3% positives against 97% negatives, perpetuating the problem.
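A quick sketch with randomly generated, hypothetical data shows why OvR alone does not help: the binary target for the minority class remains about 3% positive.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multiclass import OneVsRestClassifier

# Hypothetical data: five classes, class 4 at ~3% (the minority).
rng = np.random.default_rng(42)
X = rng.normal(size=(5000, 10))
y = rng.choice(5, size=5000, p=[0.30, 0.27, 0.25, 0.15, 0.03])

# OvR fits five binary classifiers. For the minority class, the binary
# target is still ~3% positives vs. ~97% negatives, so the imbalance
# the native multiclass model suffered from is reproduced, not removed.
print("Positive rate for minority-vs-rest:", (y == 4).mean())

ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)
```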
Switching to a macro-averaged F1-score is a good step for evaluating performance on an imbalanced dataset, as it gives equal weight to each class regardless of its frequency. However, changing the metric only helps in diagnosing the problem more accurately; it does not in itself improve the model's predictive capability on the minority class. The question asks for a step to address the poor performance, which requires a mitigation technique.
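For illustration, a small scikit-learn example with made-up labels shows how a macro-averaged F1-score exposes a failure that raw accuracy hides, without changing the model itself.

```python
from sklearn.metrics import accuracy_score, classification_report, f1_score

# Made-up labels: the model never predicts 'Product Feedback', yet
# overall accuracy still looks strong.
y_true = ['Billing'] * 60 + ['Technical Issue'] * 30 + ['Product Feedback'] * 10
y_pred = ['Billing'] * 60 + ['Technical Issue'] * 30 + ['Billing'] * 10

# Accuracy is 0.90, but macro F1 averages per-class F1 scores equally,
# so the zero F1 on 'Product Feedback' drags it down to ~0.64.
print("Accuracy:", accuracy_score(y_true, y_pred))
print("Macro F1:", f1_score(y_true, y_pred, average='macro', zero_division=0))
print(classification_report(y_true, y_pred, zero_division=0))
```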
Deploying the model as-is and scheduling retraining once more data has been collected is a passive approach that may be neither feasible nor timely. It ignores immediate actions that can improve the model. While more data is often helpful, waiting for it to accumulate organically can be slow, and techniques like SMOTE are designed to improve performance with the data already on hand.