CompTIA DataX DY0-001 (V1) Practice Question

Your team is merging two patient-registration systems for a nationwide health provider. System A contains 1.2 million records and System B contains 900,000 records. Each record stores Social Security Number (SSN), first name, last name, date of birth, and ZIP code, but about 15 percent of the SSNs in System B are missing. A deterministic inner join on SSN has matched 70 percent of System B's records. The business goal is to increase the overall match rate while keeping the false-match rate below 1 percent and avoiding a full Cartesian comparison of the two files. Which data-matching strategy is most appropriate?

  • Accept only the SSN matches and treat all unmatched rows as new patients to avoid introducing false links.

  • Create Soundex codes for first and last names and run a fuzzy join on every remaining record pair without using any blocking strategy.

  • Perform a second deterministic join that keeps the SSN match and additionally links any records whose first three name characters match exactly.

  • Use a probabilistic Fellegi-Sunter linkage that first blocks on last name and birth year, then applies Jaro-Winkler similarity on names and exact or near-exact comparisons on date of birth and ZIP code to compute match weights.
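
To make the last option concrete, here is a minimal Python sketch of blocked Fellegi-Sunter scoring. It is an illustration under stated assumptions, not a production linkage pipeline: the record layout (`id`, `first`, `last`, `dob`, `zip`), the m/u probabilities in `MU`, the 0.90 Jaro-Winkler cutoff, and the `upper` threshold are all hypothetical values chosen for the example. In practice the m/u probabilities are usually estimated from the data (for example with EM), and the threshold is tuned against the false-match target. Because the block key already requires the last name and birth year to agree, only first name, full date of birth, and ZIP code contribute weights here.

```python
import math
from collections import defaultdict


def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    hit1, hit2 = [False] * len(s1), [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len(s2), i + window + 1)):
            if not hit2[j] and s2[j] == ch:
                hit1[i] = hit2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that line up in a different order.
    a = [c for c, h in zip(s1, hit1) if h]
    b = [c for c, h in zip(s2, hit2) if h]
    t = sum(x != y for x, y in zip(a, b)) // 2
    return (matches / len(s1) + matches / len(s2) + (matches - t) / matches) / 3


def jaro_winkler(s1: str, s2: str, p: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    j = jaro(s1, s2)
    prefix = 0
    for x, y in zip(s1[:4], s2[:4]):
        if x != y:
            break
        prefix += 1
    return j + prefix * p * (1 - j)


# ASSUMED probabilities: m = P(field agrees | true match),
# u = P(field agrees | non-match). Illustrative values only.
MU = {"first": (0.95, 0.01), "dob": (0.97, 0.05), "zip": (0.90, 0.02)}


def field_weight(agrees: bool, m: float, u: float) -> float:
    """Fellegi-Sunter log2 agreement or disagreement weight for one field."""
    return math.log2(m / u) if agrees else math.log2((1 - m) / (1 - u))


def match_weight(a: dict, b: dict) -> float:
    agreement = {
        "first": jaro_winkler(a["first"].upper(), b["first"].upper()) >= 0.90,
        "dob": a["dob"] == b["dob"],
        "zip": a["zip"] == b["zip"],
    }
    return sum(field_weight(agreement[f], *MU[f]) for f in MU)


def block_key(r: dict) -> tuple:
    # Block on last name and birth year (dob is assumed to be YYYY-MM-DD).
    return (r["last"].upper(), r["dob"][:4])


def link(file_a: list, file_b: list, upper: float = 8.0) -> list:
    """Score only pairs that share a block; keep pairs at or above `upper`."""
    blocks = defaultdict(list)
    for r in file_a:
        blocks[block_key(r)].append(r)
    links = []
    for b in file_b:
        for a in blocks.get(block_key(b), []):
            w = match_weight(a, b)
            if w >= upper:
                links.append((a["id"], b["id"], round(w, 2)))
    return links


# Made-up records; SSN is deliberately absent because it is not needed here.
file_a = [
    {"id": "A1", "first": "Jonathan", "last": "Smith", "dob": "1980-04-12", "zip": "30301"},
    {"id": "A2", "first": "Maria", "last": "Smith", "dob": "1980-09-01", "zip": "30301"},
]
file_b = [
    {"id": "B1", "first": "Jonathon", "last": "Smith", "dob": "1980-04-12", "zip": "30301"},
    {"id": "B2", "first": "Mark", "last": "Smith", "dob": "1980-11-30", "zip": "30301"},
]
links = link(file_a, file_b)
print(links)
```

With these sample records, only pairs inside the same (last name, birth year) block are scored, so the number of comparisons grows with block sizes rather than with the product of the two file sizes. A fuller implementation would typically add a lower threshold and route pairs that fall between the two thresholds to clerical review.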

Operations and Processes