A data scientist is tasked with building a multi-class classification model to categorize customer support tickets into 10 distinct types. The dataset is highly imbalanced; some ticket types represent over 40% of the data, while three critical but rare types each account for less than 1%. The primary business requirement is to ensure the model performs well across all categories, giving equal importance to both common and rare ticket types. Given this specific requirement, which statistical metric is the most appropriate for evaluating model performance during design iterations?
The correct answer is Macro-Averaged F1-Score. In a multi-class classification scenario with imbalanced data, the choice of evaluation metric is critical, and the stated business requirement is to give all classes, including the rare ones, equal importance.
Macro-Averaged F1-Score: This metric calculates the F1-score for each class independently and then computes their unweighted average. By doing so, it treats all classes equally, regardless of their size. This directly addresses the business need to evaluate performance on rare but critical categories fairly.
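As a concrete illustration, here is a minimal sketch using scikit-learn's f1_score. The three-class toy labels are invented for brevity and are not drawn from a real ticket dataset:

```python
from sklearn.metrics import f1_score

# Toy imbalanced labels (3 classes for brevity): class 0 is common,
# class 2 is rare, and the model misses the rare class entirely.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

# average="macro": compute per-class F1 first, then take an unweighted
# mean, so the rare class counts exactly as much as the common one.
# zero_division=0 silences the undefined-precision warning for the
# class that was never predicted.
print(f1_score(y_true, y_pred, average="macro", zero_division=0))
# ~0.62 -- dragged down by the completely missed rare class
```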
Micro-Averaged F1-Score: This metric aggregates the counts of true positives, false negatives, and false positives across all classes before calculating a single F1-score. In an imbalanced dataset, this score will be dominated by the performance on the majority classes. For single-label, multi-class problems, the micro-F1 score is equivalent to overall accuracy.
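That equivalence is easy to verify with a quick check on the same illustrative toy labels:

```python
from sklearn.metrics import accuracy_score, f1_score

y_true = [0, 0, 0, 0, 0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 0, 0, 0, 0, 1, 1, 0, 0]

# average="micro" pools true positives, false positives, and false
# negatives across all classes; for single-label multi-class data
# this matches plain accuracy.
print(f1_score(y_true, y_pred, average="micro"))  # 0.8
print(accuracy_score(y_true, y_pred))             # 0.8 -- identical
```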
Overall Accuracy: This is the ratio of correct predictions to the total number of predictions. It is not suitable for imbalanced datasets because a model can achieve a high accuracy score by simply predicting the majority class, while failing completely on minority classes.
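A short sketch of this failure mode, again with invented toy data: a degenerate baseline that always predicts the majority class scores high accuracy while its macro F1 collapses:

```python
from sklearn.metrics import accuracy_score, f1_score

# 96 tickets of the majority type, 4 of a rare critical type.
y_true = [0] * 96 + [1] * 4
y_pred = [0] * 100  # a "model" that always predicts the majority class

print(accuracy_score(y_true, y_pred))  # 0.96 -- looks strong
print(f1_score(y_true, y_pred, average="macro", zero_division=0))
# ~0.49 -- the rare class's F1 of 0 reveals the failure
```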
R-squared: Also known as the coefficient of determination, R-squared is a metric used to evaluate the performance of regression models, not classification models. It measures the proportion of the variance in the dependent variable that is predictable from the independent variable(s).
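For reference, the standard definition (the residual sum of squares over the total sum of squares, subtracted from one):

$$ R^2 = 1 - \frac{\sum_i (y_i - \hat{y}_i)^2}{\sum_i (y_i - \bar{y})^2} $$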