During feature engineering for a multinomial logistic-regression model that predicts customer churn, a data scientist converts the nominal column "EmployerIndustry" (≈50 unique string values) to numeric form by assigning each category an integer from 0 through 49 with a basic label-encoding step. The single encoded column is then supplied to the model. Cross-validation reveals unstable coefficients and erratic performance.
Which statement best explains why this encoding choice is inappropriate for the selected algorithm?
The integer codes impose a false ordinal relationship among industries, leading logistic regression to treat higher codes as having proportionally larger (or smaller) effects on churn probability.
Label encoding expands the feature into dozens of sparse columns, and the resulting high dimensionality destabilizes the optimizer.
LabelEncoder in scikit-learn supports at most 32 distinct categories; using it with ≈50 levels causes excessive variance in the estimated coefficients.
Label encoding internally converts the column to a single binary flag, discarding most of the information needed by the model.
Label encoding simply substitutes arbitrary integers for string categories. A linear or logistic-regression model interprets those integers as magnitudes on a numeric scale, so it assumes that a category encoded as 49 lies "further" from one encoded as 0 and tries to fit a monotonic relationship along that scale. Because the 50 industries are purely nominal, any implied ordering is artificial: there is no real trend for the coefficient to capture, so it chases sampling noise and shifts from fold to fold, producing the unstable coefficients and erratic performance seen in cross-validation. The problem is not higher dimensionality, binary collapsing, or a library limit on the number of categories; it is the spurious ordinal structure introduced by the integer codes.
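As an illustrative sketch (the column and category values below are hypothetical stand-ins, not data from the question), the snippet contrasts the problematic single-integer encoding with a one-hot encoding, which represents a nominal feature without imposing any ordering:

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy stand-in for the real data; the category names are invented for illustration.
df = pd.DataFrame({
    "EmployerIndustry": ["Retail", "Mining", "Finance", "Retail", "Healthcare"]
})

# Problematic: one integer column. Logistic regression reads these codes as
# magnitudes, implying an ordering such as Finance < Healthcare < Mining < Retail.
df["industry_label"] = LabelEncoder().fit_transform(df["EmployerIndustry"])

# Better for a nominal feature: one indicator column per category, so no
# artificial scale is introduced (drop_first avoids perfect collinearity).
industry_dummies = pd.get_dummies(df["EmployerIndustry"],
                                  prefix="industry", drop_first=True)

print(df)
print(industry_dummies)
```

With roughly 50 levels, the one-hot approach adds about 49 sparse indicator columns, which a regularized logistic regression handles without any spurious ordinal structure.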