CompTIA DataX DY0-001 (V1) Practice Question

A machine learning team is developing a computer vision model using a 2TB dataset of high-resolution images. The team uses Git for source code versioning but is struggling to version the dataset itself. Standard Git is not viable because of the dataset's size, and although Git LFS was considered, its storage costs and performance overhead with the team's cloud provider are prohibitive. The team's primary requirement is to maintain lightweight, reproducible links between specific code versions and the corresponding data versions without duplicating the entire dataset for each change.

Which of the following solutions would be the MOST effective and cost-efficient for the team to implement for data versioning in this scenario?

  • Build a custom database and API to manage data versioning by storing file paths and metadata, which the team's code will query to retrieve specific dataset versions.

  • Utilize Docker to package each version of the 2TB dataset into a new container image, versioning the data and the environment together for reproducibility.

  • Implement Data Version Control (DVC) to track metadata pointers in Git while keeping the actual image files in separate, cost-effective cloud storage.

  • Store periodic snapshots of the dataset as compressed archives in a shared cloud storage location, using a naming convention that corresponds to Git commit hashes.
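
The scenario's core requirement, a lightweight link between a code version and a data version without copying the whole dataset on every change, can be illustrated with a small content-addressed store plus a pointer file. The following is only a minimal sketch under stated assumptions: the function names (`file_digest`, `snapshot`), the local `store` directory standing in for cloud storage, and the JSON pointer format are all hypothetical and are not the format or API of any particular tool.

```python
import hashlib
import json
import shutil
import tempfile
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def snapshot(data_dir: Path, store: Path, pointer: Path) -> dict:
    """Copy unseen file contents into `store` and write a small pointer file.

    `store` is a hypothetical stand-in for cheap object storage; `pointer`
    is the only file that would be committed to Git alongside the code.
    Both must live outside `data_dir`.
    """
    store.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for path in sorted(p for p in data_dir.rglob("*") if p.is_file()):
        digest = file_digest(path)
        target = store / digest
        if not target.exists():  # identical content is stored only once
            shutil.copy2(path, target)
        manifest[path.relative_to(data_dir).as_posix()] = digest
    pointer.write_text(json.dumps(manifest, indent=2, sort_keys=True))
    return manifest


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    data = root / "images"
    data.mkdir()
    (data / "a.png").write_bytes(b"image-a")
    (data / "b.png").write_bytes(b"image-b")

    v1 = snapshot(data, root / "store", root / "images.pointer.json")
    (data / "b.png").write_bytes(b"image-b-edited")  # change one file only
    v2 = snapshot(data, root / "store", root / "images.pointer.json")

    # Only the changed file adds a new object; a.png is shared by both versions.
    print(len(list((root / "store").iterdir())), "objects in store")
```

In this sketch, each Git commit would carry only the small pointer file, so checking out an older commit identifies exactly which stored objects make up that dataset version, and unchanged images are never stored twice.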

Operations and Processes