During training you notice that a deep multilayer perceptron that uses tanh(x) in every hidden layer begins to learn extremely slowly after the first few epochs. You suspect the gradients are vanishing as they are back-propagated. From a mathematical standpoint, which property of the tanh activation most directly explains why its use can drive gradients toward zero when neuron inputs have large magnitude?
Its first derivative is 1 − tanh²(x), which tends to zero as |x| becomes large, so back-propagated gradients are repeatedly attenuated.
Its output range is strictly 0 to 1, so activations stay positive and bias the gradient toward zero.
Its second derivative is a constant 1, so there is no curvature change and gradients get stuck at saddle points instead of vanishing.
Its first derivative equals x for |x| > 1, causing gradients to grow without bound and leading to exploding rather than vanishing gradients.
Back-propagation multiplies the upstream gradient by the local derivative of each activation. For tanh that derivative is tanh′(x) = 1 − tanh²(x). When a neuron's pre-activation |x| becomes large, tanh(x) saturates near ±1, so tanh′(x) is almost zero. Repeated multiplication by these near-zero factors across many layers quickly shrinks the gradient, producing the vanishing-gradient problem. The other options fail because the derivative does not equal x, the second derivative is not constant, and tanh outputs values in the open interval −1 to 1, not 0 to 1.
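A minimal NumPy sketch can make both points concrete (the depth of 20 layers and the saturated pre-activation value of 3.0 are illustrative assumptions, not taken from the question): the local derivative 1 − tanh²(x) collapses as |x| grows, and multiplying one such factor per layer drives the back-propagated gradient toward zero.

import numpy as np

def tanh_grad(x):
    # Derivative of tanh: 1 - tanh(x)^2, which tends to 0 as |x| grows.
    return 1.0 - np.tanh(x) ** 2

# The local derivative shrinks rapidly once the pre-activation saturates.
for x in (0.0, 2.0, 5.0, 10.0):
    print(f"x = {x:5.1f}  tanh'(x) = {tanh_grad(x):.2e}")

# Back-propagation multiplies one such factor per layer. With saturated
# pre-activations (all set to 3.0 here purely for illustration), the
# product collapses toward zero as depth grows.
depth = 20
pre_activations = np.full(depth, 3.0)   # hypothetical saturated pre-activations
grad = 1.0                              # upstream gradient at the output
for x in pre_activations:
    grad *= tanh_grad(x)
print(f"gradient after {depth} saturated tanh layers: {grad:.2e}")

Running this prints per-layer derivatives on the order of 1e-2 for x = 3 and a final gradient on the order of 1e-40 after 20 layers, which is the repeated attenuation described above.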