CompTIA DataX DY0-001 (V1) Practice Question

A data-science team must deploy an automatic summarization service for customer-support incident tickets. Business and compliance rules require that every summary be no longer than 30 words and paraphrase the ticket rather than copy its sentences, so that sensitive customer information is not exposed. The team has about 100,000 English ticket-to-summary pairs, written by analysts, available for supervised learning, and the company can run models on a GPU cluster. Within the next quarter, the service must also support Spanish and French with minimal additional annotation effort. Finally, during system acceptance, reviewers will value semantic equivalence with analyst summaries more than exact n-gram overlap.

Which combination of modeling approach and primary automatic evaluation metric best satisfies these requirements?

  • Use a simple Lead-3 extraction baseline and report the compression ratio of tokens as the main metric.

  • Fine-tune a pre-trained multilingual encoder-decoder Transformer (e.g., mBART/mT5) for abstractive summarization and evaluate with BERTScore.

  • Train a skip-gram Word2Vec model to identify key phrases and measure precision-recall of the extracted phrases.

  • Apply the unsupervised TextRank algorithm to extract top-ranked sentences and evaluate with ROUGE-1.
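
The constraints in the scenario can be made concrete with a small sketch. The Python below is illustrative only and is not part of the original question: the helper names (`within_word_limit`, `copies_sentence`, `rouge1_f1`) and the sample ticket text are hypothetical. It checks the 30-word limit, runs a crude test for verbatim sentence copying, and computes a simplified ROUGE-1 F1 to show how a faithful paraphrase can score poorly on unigram overlap. A semantic metric such as BERTScore needs a pre-trained model and is not reproduced here.

```python
import re
from collections import Counter


def tokens(text):
    """Lowercase and split into simple word tokens (a simplifying assumption)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def within_word_limit(summary, max_words=30):
    """Check the length rule from the scenario: at most 30 words."""
    return len(summary.split()) <= max_words


def copies_sentence(ticket, summary):
    """Crude copy check: True if any summary sentence appears verbatim
    (ignoring case and punctuation) inside the ticket text."""
    ticket_norm = " " + " ".join(tokens(ticket)) + " "
    for sentence in re.split(r"[.!?]+", summary):
        sentence_norm = " ".join(tokens(sentence))
        if sentence_norm and f" {sentence_norm} " in ticket_norm:
            return True
    return False


def rouge1_f1(reference, candidate):
    """Simplified ROUGE-1 F1: clipped unigram overlap, no stemming."""
    ref, cand = Counter(tokens(reference)), Counter(tokens(candidate))
    overlap = sum((ref & cand).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical data, invented for illustration.
ticket = ("Customer cannot log in after password reset. "
          "Account ID 4471 was verified by phone.")
analyst_summary = "Customer cannot log in after password reset."
paraphrase = "User is locked out following a credential change."

print(within_word_limit(paraphrase))           # length rule
print(copies_sentence(ticket, paraphrase))     # paraphrase: no copied sentence
print(copies_sentence(ticket, analyst_summary))  # verbatim sentence is flagged
print(rouge1_f1(analyst_summary, paraphrase))  # no shared unigrams
```

In this toy case the paraphrase shares no words with the analyst summary, so its unigram overlap is zero even though a human reviewer might judge the two equivalent. That gap is what the acceptance criterion about semantic equivalence versus exact n-gram overlap refers to.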

Specialized Applications of Data Science