A data science team at a financial services company is developing a model to predict the probability of loan default. The dataset contains over 100 features, and exploratory data analysis reveals strong multicollinearity among several predictors, such as "debt-to-income ratio" and "credit utilization rate". The primary business objective is to create a predictive model that is also parsimonious, automatically performing feature selection to identify the most significant predictors of default. Which of the following models is most appropriate for this task?
Linear Discriminant Analysis (LDA)
Ordinary Least Squares (OLS) regression
Ridge regression
Least Absolute Shrinkage and Selection Operator (LASSO) regression
The correct answer is LASSO (Least Absolute Shrinkage and Selection Operator) regression. This model is the most suitable choice because it meets both of the scenario's requirements: handling multicollinearity and performing automatic feature selection. LASSO uses L1 regularization, which adds a penalty term proportional to the absolute value of the coefficients. A key feature of this penalty is its ability to shrink the coefficients of less important features to exactly zero, effectively removing them from the model. This results in a simpler, more interpretable (parsimonious) model that highlights the most significant predictors, which aligns perfectly with the business objective.
Ridge regression is incorrect because while it effectively handles multicollinearity by using L2 regularization to shrink coefficients, it does not perform feature selection. The coefficients are shrunk towards zero but never become exactly zero, meaning all features are retained in the final model.
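The contrast between the two penalties is easy to demonstrate. The sketch below (a minimal illustration on synthetic data, not the scenario's actual loan dataset) fits scikit-learn's `Lasso` and `Ridge` to features where only two are truly predictive and two are strongly collinear; the L1 model zeroes out the irrelevant coefficients while the L2 model retains every feature with a nonzero weight.

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

# Synthetic stand-in data: 10 features, only 2 truly predictive,
# with one pair of strongly collinear columns (like debt-to-income
# ratio and credit utilization rate in the scenario).
rng = np.random.default_rng(0)
n, p = 200, 10
X = rng.normal(size=(n, p))
X[:, 1] = X[:, 0] + 0.05 * rng.normal(size=n)       # near-duplicate feature
y = 3 * X[:, 0] + 2 * X[:, 2] + rng.normal(size=n)  # only features 0 and 2 matter

lasso = Lasso(alpha=0.5).fit(X, y)
ridge = Ridge(alpha=0.5).fit(X, y)

# The L1 penalty drives unimportant coefficients to exactly zero,
# performing feature selection automatically...
print("LASSO coefficients set to zero:", int(np.sum(lasso.coef_ == 0)))

# ...while the L2 penalty shrinks coefficients toward zero but keeps
# every feature in the model with a small nonzero weight.
print("Ridge coefficients set to zero:", int(np.sum(ridge.coef_ == 0)))
```

In practice the penalty strength `alpha` would be tuned by cross-validation (e.g. with `LassoCV`), but the qualitative behavior is the same: only the L1 penalty produces a sparse, parsimonious model.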
Ordinary Least Squares (OLS) regression is incorrect because it performs poorly in the presence of strong multicollinearity. Although only perfect collinearity violates a formal OLS assumption, strong multicollinearity inflates the variance of the coefficient estimates, making them unstable and unreliable for interpretation. OLS also applies no regularization and performs no feature selection, so all 100-plus features would remain in the model.
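That instability is simple to observe. The sketch below (an illustrative simulation, not taken from the scenario) refits OLS on bootstrap resamples of a dataset with two nearly identical predictors; the estimated coefficient on the first predictor swings wildly from refit to refit, exactly the high-variance behavior that makes OLS unreliable here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100
x1 = rng.normal(size=n)
x2 = x1 + 0.01 * rng.normal(size=n)          # nearly identical predictor
y = x1 + rng.normal(scale=0.5, size=n)       # only x1 actually drives y

# Refit OLS on bootstrap resamples and track the coefficient on x1.
coefs = []
for _ in range(200):
    idx = rng.integers(0, n, n)
    A = np.column_stack([np.ones(n), x1[idx], x2[idx]])
    beta, *_ = np.linalg.lstsq(A, y[idx], rcond=None)
    coefs.append(beta[1])

# With collinear predictors, the individual coefficient estimates are
# highly unstable even though the fitted values remain reasonable.
print("std of x1 coefficient across refits:", np.std(coefs))
```

The standard deviation of the coefficient dwarfs the true value of 1, illustrating why individual OLS coefficients cannot be trusted when predictors are strongly correlated.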
Linear Discriminant Analysis (LDA) is incorrect because it is primarily a classification and dimensionality reduction technique, not a regression method designed for this type of feature selection problem. While LDA reduces dimensions by finding linear combinations of features that best separate classes, it does not provide a parsimonious regression model in the way that LASSO does.