During hyper-parameter tuning of a Ridge regression model, you standardize all 120 numeric predictors and evaluate five penalty values (λ = 0, 0.1, 1, 10, 100) with 10-fold cross-validation. The average validation MSE falls as λ increases from 0, reaches its minimum around λ ≈ 5, then climbs steeply by the time λ reaches 100. Pre-processing and data splits have already been verified. Which explanation best accounts for the rise in validation error at very large λ values?
A high λ forces some coefficients exactly to zero, removing important predictors and increasing variance in the folds.
Large λ amplifies multicollinearity, making the coefficient estimates more sensitive to small changes in the data.
The matrix (XᵀX + λI) becomes non-invertible at large λ values, causing numerical instability that inflates the error.
A very large λ over-penalizes the weights, shrinking almost all coefficients toward zero and introducing high bias, so the model underfits the data.
The Ridge penalty adds λ∑β² to the loss. A moderate λ reduces variance by shrinking the coefficients and often improves generalization, but an excessively large λ drives almost every weight toward zero. When the model can no longer capture the underlying signal, it becomes highly biased and underfits, so the cross-validated MSE rises again. Adding λI (for λ > 0) never makes (XᵀX + λI) non-invertible; if anything, it improves the condition number. Unlike LASSO, Ridge does not set coefficients exactly to zero, and its L2 penalty mitigates rather than amplifies multicollinearity. Therefore the increase in error is best explained by underfitting caused by an overly strong penalty.
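To make the scenario concrete, here is a minimal sketch using scikit-learn on synthetic data; the 500-sample make_regression dataset and the extra λ = 10 000 grid point are illustrative assumptions, not part of the question. It sweeps the penalty with 10-fold cross-validation and shows the validation MSE rising again once the penalty becomes very strong.

```python
# Minimal sketch (synthetic data, assumed settings): sweep Ridge penalties with
# 10-fold CV and watch the average validation MSE rise again at very large lambda.
from sklearn.datasets import make_regression
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Stand-in for the 120 standardized numeric predictors.
X, y = make_regression(n_samples=500, n_features=120, n_informative=40,
                       noise=10.0, random_state=0)

for lam in [0, 0.1, 1, 10, 100, 10_000]:  # 10_000 added to exaggerate the underfitting regime
    # Scale inside the pipeline so each fold is standardized on its own training split.
    model = make_pipeline(StandardScaler(), Ridge(alpha=lam))
    mse = -cross_val_score(model, X, y, cv=10,
                           scoring="neg_mean_squared_error").mean()
    print(f"lambda = {lam:>7}: mean CV MSE = {mse:.1f}")

# Expected pattern: MSE falls for moderate lambda, then climbs once the penalty
# shrinks nearly every coefficient toward zero and the model underfits.
```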
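The explanation's other two claims can be checked the same way. The rough sketch below (again on assumed synthetic data with arbitrary α values) shows that LASSO zeroes out coefficients while Ridge only shrinks them, and that adding λI to XᵀX lowers its condition number rather than making it singular.

```python
# Rough sketch (assumed synthetic data and alpha values): Ridge shrinks but does
# not zero out coefficients, LASSO does, and (X^T X + lambda*I) only becomes
# better conditioned as lambda grows.
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso, Ridge
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=300, n_features=50, n_informative=10,
                       noise=5.0, random_state=1)
X = StandardScaler().fit_transform(X)

ridge = Ridge(alpha=10.0).fit(X, y)
lasso = Lasso(alpha=10.0).fit(X, y)
print("Ridge coefficients exactly zero:", int(np.sum(ridge.coef_ == 0)))  # typically 0
print("LASSO coefficients exactly zero:", int(np.sum(lasso.coef_ == 0)))  # typically many

XtX = X.T @ X
for lam in [0, 1, 100]:
    cond = np.linalg.cond(XtX + lam * np.eye(X.shape[1]))
    print(f"lambda = {lam:>3}: condition number of (XtX + lambda*I) = {cond:.2e}")
```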