CompTIA DataX DY0-001 (V1) Practice Question

A data scientist is performing exploratory data analysis (EDA) on a dataset for a real estate valuation model. The dataset includes a feature named view_quality, which is rated by professional assessors on a custom scale: "No View", "Partial Obstruction", "Standard", "Good", and "Excellent". The team is debating the most appropriate way to handle this feature for a multiple linear regression model versus a gradient boosting machine (GBM).

Which of the following statements most accurately describes the view_quality feature and the implications for its use in modeling?

  • view_quality is a continuous variable that has been binned. For use in a linear regression model, the mid-points of the implied continuous range for each category should be calculated and used as the feature value. For a GBM, this feature can be used directly.

  • view_quality is an ordinal variable. For a linear regression model, treating it as a continuous integer (e.g., 0-4) assumes equidistant spacing between categories, which is likely false and could violate model assumptions. For a GBM, integer encoding is generally effective as the model can create splits at any point along the ordered values.

  • view_quality is a discrete variable. It can be used directly in a linear regression model without transformation because the model will interpret the integer values as distinct points. For a GBM, it should be treated as a categorical feature to allow for optimal splits.

  • view_quality is a nominal variable. It must be one-hot encoded for both linear regression and GBMs to avoid introducing a false sense of order, which would negatively impact the performance of both model types.
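
As a rough illustration of the two encodings the options refer to, the sketch below maps view_quality to integer ranks and to indicator (one-hot) columns using only the Python standard library. The level ordering is taken from the question; the function names, the sample values, and the choice to drop the first level as a baseline are assumptions made for this example, not part of the exam material.

```python
# Ordered levels as listed in the question, lowest to highest.
VIEW_LEVELS = ["No View", "Partial Obstruction", "Standard", "Good", "Excellent"]


def ordinal_encode(values, levels=VIEW_LEVELS):
    """Map each label to its rank (0-4).

    Keeps the ordering, but a linear model will read the ranks as
    equally spaced steps on a numeric scale.
    """
    rank = {label: i for i, label in enumerate(levels)}
    return [rank[v] for v in values]


def one_hot_encode(values, levels=VIEW_LEVELS, drop_first=True):
    """Map each label to 0/1 indicator columns.

    Makes no spacing assumption, but discards the ordering.
    drop_first=True omits the first level as the baseline (hypothetical
    choice here) so the columns are not collinear with an intercept.
    """
    kept = levels[1:] if drop_first else levels
    return [[1 if v == label else 0 for label in kept] for v in values]


# Hypothetical sample of assessor ratings.
sample = ["Good", "No View", "Excellent", "Standard"]

ordinal = ordinal_encode(sample)
dummies = one_hot_encode(sample)

print(ordinal)  # [3, 0, 4, 2]
print(dummies)  # [[0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0]]
```

With the integer ranks, a tree-based model can split at any threshold between adjacent levels, whereas a linear model fits a single slope across all five ranks; the indicator columns instead give each level its own coefficient.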

Modeling, Analysis, and Outcomes