CompTIA DataX DY0-001 (V1) Practice Question

You are building an LDA-based topic exploration dashboard for a corpus of 50,000 product reviews. During hyperparameter tuning you train several models with 20 to 150 topics. Perplexity on a held-out validation set keeps decreasing as more topics are added, yet domain experts say the extra topics become redundant and semantically confusing beyond roughly 80. Which additional quantitative metric should you add to the tuning pipeline so that the automatically selected model better tracks human interpretability of the topics?

  • Measure the BLEU score between the model's predicted words and the original review sentences.

  • Use the silhouette coefficient of k-means clusters built from TF-IDF document vectors to pick the best topic count.

  • Compute a topic coherence score (e.g., c_v or NPMI) on the top words of each model and choose the model that maximizes it.

  • Rely solely on the validation perplexity because lower perplexity always implies more interpretable topics.
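For context on the coherence-based option, here is a minimal sketch of how such a metric could be added to the tuning loop, assuming gensim is available. The toy token lists and the small topic counts are illustrative placeholders for the real 50,000-review corpus and the 20-to-150 topic sweep described in the question.

```python
# Minimal sketch: assumes gensim is installed. `tokenized_reviews` is a toy
# stand-in for the real tokenized review corpus, and the small topic counts
# stand in for the 20-to-150 sweep.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

tokenized_reviews = [
    ["battery", "life", "lasts", "all", "day"],
    ["battery", "drains", "fast", "poor", "life"],
    ["screen", "bright", "sharp", "display"],
    ["screen", "cracked", "after", "one", "drop"],
    ["shipping", "fast", "arrived", "early"],
    ["shipping", "slow", "package", "damaged"],
    ["great", "price", "good", "value"],
    ["price", "too", "high", "for", "the", "quality"],
]

dictionary = Dictionary(tokenized_reviews)
bow_corpus = [dictionary.doc2bow(doc) for doc in tokenized_reviews]

results = []
for num_topics in (2, 3, 4):  # in practice: the 20-150 candidate models
    lda = LdaModel(
        corpus=bow_corpus,
        id2word=dictionary,
        num_topics=num_topics,
        random_state=42,
        passes=10,
    )
    # Coherence (c_v here; coherence="c_npmi" is a drop-in alternative) is
    # computed over each topic's top words against the reference texts.
    coherence = CoherenceModel(
        model=lda,
        texts=tokenized_reviews,
        dictionary=dictionary,
        coherence="c_v",
        topn=5,
    ).get_coherence()
    results.append((num_topics, coherence))
    # Perplexity would be evaluated on a held-out set in a real pipeline;
    # the training corpus is reused here only to keep the sketch short.
    print(f"k={num_topics}: c_v={coherence:.3f}, "
          f"log perplexity={lda.log_perplexity(bow_corpus):.3f}")

# Select the topic count that maximizes coherence instead of minimizing perplexity.
best_k, best_cv = max(results, key=lambda pair: pair[1])
print(f"Selected num_topics={best_k} (c_v={best_cv:.3f})")
```

Switching the selection rule from minimizing perplexity to maximizing coherence is what lets the pipeline detect the point (around 80 topics in this scenario) where additional topics stop producing semantically distinct groups of top words.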

Specialized Applications of Data Science