CompTIA DataX DY0-001 (V1) Practice Question

During an early design iteration, your team is fine-tuning a 250-million-parameter Transformer on a single 24 GB GPU. When you raise the mini-batch size from 16 to 64, training fails with an out-of-memory (OOM) error, and the budget does not allow additional hardware. You have one day to rerun the experiment and want to keep the architecture and hyperparameter search results unchanged. Which change to the training configuration is the most appropriate way to satisfy the resource constraint while minimizing impact on model accuracy and development time?

  • Enable mixed-precision (FP16/bfloat16) training with automatic loss scaling.

  • Pad every input sequence to exactly 512 tokens so tensor shapes are consistent across batches.

  • Double the model's hidden dimension but freeze all even-numbered layers to reduce gradient updates.

  • Replace the AdamW optimizer with standard SGD without momentum to eliminate optimizer state.
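For reference, the first answer choice describes the standard automatic mixed-precision (AMP) recipe. Below is a minimal sketch of what that configuration change can look like in PyTorch; the toy model, synthetic data, loop length, and learning rate are placeholders standing in for the question's 250-million-parameter Transformer and are not part of the original scenario.

```python
import torch
import torch.nn as nn
from torch.cuda.amp import autocast, GradScaler

# Toy stand-in for the Transformer in the question: any nn.Module is wrapped the same way.
model = nn.Sequential(nn.Linear(512, 1024), nn.ReLU(), nn.Linear(1024, 2)).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-4)
criterion = nn.CrossEntropyLoss()
scaler = GradScaler()  # manages automatic loss scaling for FP16

for step in range(10):
    inputs = torch.randn(64, 512, device="cuda")        # mini-batch of 64, as in the scenario
    labels = torch.randint(0, 2, (64,), device="cuda")
    optimizer.zero_grad()
    with autocast():                      # forward pass runs in 16-bit where numerically safe
        loss = criterion(model(inputs), labels)
    scaler.scale(loss).backward()         # scale the loss so small FP16 gradients do not underflow
    scaler.step(optimizer)                # unscales gradients; skips the step if inf/NaN appears
    scaler.update()                       # adjusts the loss-scale factor over time
```

Under this setup, activations and most matrix multiplications are computed and stored in 16-bit, roughly halving activation memory, while master weights and optimizer state remain in FP32, so the architecture and tuned hyperparameters are left unchanged.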

Exam Domain: Modeling, Analysis, and Outcomes