CompTIA DataX DY0-001 (V1) Practice Question

A data scientist for a global logistics company is tasked with enriching a dataset of 5 million customer records. The dataset includes a full_address field, which is a free-text entry containing significant variations in formatting, abbreviations, and occasional errors. The primary goal is to append precise latitude and longitude coordinates to each record for a critical territory optimization and route planning model. The project must be cost-effective and completed within a strict two-week timeline. Which of the following geocoding strategies is the MOST effective and robust for this scenario?

  • Bypass coordinate-level geocoding to save time and cost. Instead, extract city and state information from the address strings and apply one-hot encoding to these categorical features for use in the optimization model.

  • Develop a sequential pipeline that first standardizes address strings using parsing libraries. Next, use a high-throughput, paid batch geocoding service that supports asynchronous processing for the bulk of the data. For any addresses that fail to geocode, implement a fallback to a second geocoding provider to maximize coverage.

  • Directly stream all 5 million raw address strings into a real-time, pay-per-query geocoding API. Monitor the process and manually retry any failures until all records are processed to ensure completeness.

  • Set up an internal geocoding server using open-source software like Nominatim with OpenStreetMap data. After importing the relevant map data, process the 5 million addresses internally to avoid external API costs and rate limits.
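
For reference, the sequential pipeline described in the second option (standardize, batch geocode, then fall back to a second provider for failures) could be sketched roughly as below. This is a minimal illustration, not any vendor's API: `primary` and `fallback` are hypothetical callables standing in for two batch geocoding services, the abbreviation table is a toy substitute for a real address-parsing library, and `batch_size` is an assumed parameter.

```python
import re
from typing import Callable, Optional

Coords = tuple[float, float]
# Hypothetical provider interface: takes a batch of addresses and returns
# one (lat, lon) pair or None per address, in the same order.
BatchGeocoder = Callable[[list[str]], list[Optional[Coords]]]

# Toy abbreviation table (assumption); a real pipeline would rely on a
# dedicated address-parsing library instead.
ABBREVIATIONS = {"st": "street", "ave": "avenue", "rd": "road", "blvd": "boulevard"}


def standardize(address: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace, expand a few abbreviations."""
    cleaned = re.sub(r"[^\w\s]", " ", address.lower())
    return " ".join(ABBREVIATIONS.get(tok, tok) for tok in cleaned.split())


def geocode_with_fallback(
    addresses: list[str],
    primary: BatchGeocoder,
    fallback: BatchGeocoder,
    batch_size: int = 1000,
) -> list[Optional[Coords]]:
    """Standardize, geocode in batches with the primary provider,
    then send only the failures to the fallback provider."""
    cleaned = [standardize(a) for a in addresses]

    coords: list[Optional[Coords]] = []
    for start in range(0, len(cleaned), batch_size):
        coords.extend(primary(cleaned[start:start + batch_size]))

    # Indices of addresses the primary provider could not resolve.
    failed = [i for i, c in enumerate(coords) if c is None]
    if failed:
        retried = fallback([cleaned[i] for i in failed])
        for i, c in zip(failed, retried):
            coords[i] = c
    return coords
```

Sending only the failed subset to the second provider keeps the extra cost proportional to the failure rate rather than to the full 5 million records.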

Modeling, Analysis, and Outcomes