CompTIA DataX DY0-001 (V1) Practice Question

A machine learning engineer is training a deep neural network on a large dataset using a high-performance GPU. They observe that with stochastic gradient descent (batch size of 1), training progresses slowly because the hardware is utilized inefficiently, and the loss curve is extremely erratic. Conversely, attempting full-batch gradient descent results in out-of-memory errors. The engineer's goal is to achieve a stable and computationally efficient training process. Which of the following strategies directly addresses this specific trade-off?

  • Decrease the learning rate and add momentum to smooth out the erratic loss curve observed with stochastic updates.

  • Utilize mini-batch gradient descent with a batch size chosen to balance the noisy gradient estimates of stochastic updates and the memory requirements of full-batch processing, thereby enabling efficient parallel computation on the GPU.

  • Switch to the Adam optimizer, as its adaptive learning rates are inherently designed to handle large datasets without requiring batching.

  • Implement batch normalization after each layer to stabilize the erratic loss curve caused by the stochastic updates.
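For reference, the mini-batch approach mentioned in one of the options can be sketched as follows. This is a minimal illustration only, assuming PyTorch; the synthetic dataset, model architecture, and batch size of 256 are arbitrary placeholders, not details from the scenario.

```python
# Minimal sketch of mini-batch gradient descent (illustrative assumptions throughout).
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

# Synthetic data standing in for the engineer's large dataset.
X = torch.randn(10_000, 64)
y = torch.randint(0, 2, (10_000,))

# The DataLoader yields mini-batches: larger than 1 (less noisy gradient
# estimates, better GPU utilization) but far smaller than the full dataset
# (fits in device memory).
loader = DataLoader(TensorDataset(X, y), batch_size=256, shuffle=True)

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Sequential(nn.Linear(64, 128), nn.ReLU(), nn.Linear(128, 2)).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
loss_fn = nn.CrossEntropyLoss()

for epoch in range(3):
    for xb, yb in loader:
        xb, yb = xb.to(device), yb.to(device)
        optimizer.zero_grad()
        loss = loss_fn(model(xb), yb)  # gradient estimated from one mini-batch
        loss.backward()
        optimizer.step()
```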
