CompTIA DataX DY0-001 (V1) Practice Question

During training, you notice that a deep multilayer perceptron using tanh(x) in every hidden layer begins to learn extremely slowly after the first few epochs. You suspect the gradients are vanishing as they are back-propagated. From a mathematical standpoint, which property of the tanh activation most directly explains why its use can drive gradients toward zero when neuron inputs have large magnitude?

  • Its first derivative is 1 − tanh²(x), which tends to zero as |x| becomes large, so back-propagated gradients are repeatedly attenuated.

  • Its output range is strictly 0 to 1, so activations stay positive and bias the gradient toward zero.

  • Its second derivative is a constant 1, so there is no curvature change and gradients get stuck at saddle points instead of vanishing.

  • Its first derivative equals x for |x| > 1, causing gradients to grow without bound and leading to exploding rather than vanishing gradients.
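For intuition, here is a minimal NumPy sketch (not part of the original question; the 10-layer depth and pre-activations near 3 are illustrative assumptions) demonstrating the property described in the first option: tanh'(x) = 1 − tanh²(x) shrinks toward zero for large |x|, and back-propagation multiplies in one such factor per layer, so the gradient is attenuated exponentially with depth.

    import numpy as np

    # tanh'(x) = 1 - tanh(x)^2 approaches 0 as |x| grows (saturation).
    for x in (0.0, 2.0, 5.0):
        print(f"x = {x:4.1f}   tanh'(x) = {1 - np.tanh(x)**2:.6f}")
    # x =  0.0   tanh'(x) = 1.000000
    # x =  2.0   tanh'(x) = 0.070651
    # x =  5.0   tanh'(x) = 0.000182

    # Each layer contributes one such factor to the back-propagated
    # gradient, so ten saturated layers (pre-activations near 3)
    # scale it by roughly:
    factor = (1 - np.tanh(3.0)**2) ** 10
    print(f"10-layer attenuation: {factor:.1e}")   # ~ 8.7e-21

A factor of this size effectively zeroes out the weight updates in the early layers, which matches the "learns extremely slowly after the first few epochs" symptom in the question.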

Domain: Machine Learning