CompTIA DataX DY0-001 (V1) Practice Question

You are building an LDA-based topic exploration dashboard for a corpus of 50,000 product reviews. During hyperparameter tuning you train several models with 20 to 150 topics. Perplexity on a held-out validation set keeps decreasing as more topics are added, yet domain experts say the extra topics become redundant and semantically confusing beyond roughly 80. Which additional quantitative metric should you add to the tuning pipeline so that the automatically selected model better tracks human interpretability of the topics?

  • Measure the BLEU score between the model's predicted words and the original review sentences.

  • Use the silhouette coefficient of k-means clusters built from TF-IDF document vectors to pick the best topic count.

  • Compute a topic coherence score (e.g., c_v or NPMI) on the top words of each model and choose the model that maximizes it.

  • Rely solely on the validation perplexity because lower perplexity always implies more interpretable topics.
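For context on the coherence-based option, here is a minimal sketch of how such a metric could be added to the tuning loop, assuming gensim is available. The toy token lists and the small topic counts are illustrative placeholders for the real 50,000-review corpus and the 20-to-150 topic sweep described in the question.

```python
# Minimal sketch: assumes gensim is installed. `tokenized_reviews` is a toy
# stand-in for the real tokenized review corpus, and the small topic counts
# stand in for the 20-to-150 sweep.
from gensim.corpora import Dictionary
from gensim.models import CoherenceModel, LdaModel

tokenized_reviews = [
    ["battery", "life", "lasts", "all", "day"],
    ["battery", "drains", "fast", "poor", "life"],
    ["screen", "bright", "sharp", "display"],
    ["screen", "cracked", "after", "one", "drop"],
    ["shipping", "fast", "arrived", "early"],
    ["shipping", "slow", "package", "damaged"],
    ["great", "price", "good", "value"],
    ["price", "too", "high", "for", "the", "quality"],
]

dictionary = Dictionary(tokenized_reviews)
bow_corpus = [dictionary.doc2bow(doc) for doc in tokenized_reviews]

results = []
for num_topics in (2, 3, 4):  # in practice: the 20-150 candidate models
    lda = LdaModel(
        corpus=bow_corpus,
        id2word=dictionary,
        num_topics=num_topics,
        random_state=42,
        passes=10,
    )
    # Coherence (c_v here; coherence="c_npmi" is a drop-in alternative) is
    # computed over each topic's top words against the reference texts.
    coherence = CoherenceModel(
        model=lda,
        texts=tokenized_reviews,
        dictionary=dictionary,
        coherence="c_v",
        topn=5,
    ).get_coherence()
    results.append((num_topics, coherence))
    # Perplexity would be evaluated on a held-out set in a real pipeline;
    # the training corpus is reused here only to keep the sketch short.
    print(f"k={num_topics}: c_v={coherence:.3f}, "
          f"log perplexity={lda.log_perplexity(bow_corpus):.3f}")

# Select the topic count that maximizes coherence instead of minimizing perplexity.
best_k, best_cv = max(results, key=lambda pair: pair[1])
print(f"Selected num_topics={best_k} (c_v={best_cv:.3f})")
```

Switching the selection rule from minimizing perplexity to maximizing coherence is what lets the pipeline detect the point (around 80 topics in this scenario) where additional topics stop producing semantically distinct groups of top words.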

Specialized Applications of Data Science