A data science team is preparing a large customer dataset to train a machine learning model for predicting fraudulent transactions. The dataset contains direct identifiers such as names and email addresses, as well as quasi-identifiers like ZIP codes and dates of birth. To adhere to strict data privacy regulations, the team must de-identify the data before analysis. Which of the following strategies provides the best balance between robustly protecting Personally Identifiable Information (PII) and preserving the analytical value of the features for the model?
Apply a character-masking function to all PII fields, replacing each character with a fixed symbol (e.g., 'X').
Completely remove all columns identified as direct and quasi-identifiers from the dataset.
Encrypt the entire dataset before loading it into the training environment and decrypt it just before model fitting.
Remove the direct identifiers and apply a consistent tokenization scheme to the quasi-identifiers.
The correct approach is to remove the direct identifiers and apply a consistent tokenization scheme to the quasi-identifiers. Direct identifiers such as names and email addresses offer little analytical value as features and pose a high privacy risk, so they should be removed outright. Quasi-identifiers, such as ZIP code or date of birth, often carry valuable predictive signal. Consistent tokenization replaces each value with a surrogate token that cannot be mapped back to the original without access to the tokenization key or vault, and the same input always yields the same token (e.g., '90210' becomes 'A7B2C9' in every record where it appears). This breaks the link to real-world identity while preserving referential integrity, allowing the model to learn patterns associated with these features (e.g., certain ZIP codes having higher fraud rates).
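For illustration only, the following is a minimal Python sketch of consistent tokenization using a keyed hash (HMAC). The key name, token length, and ZIP code example are assumptions added here, not part of the question; a production system would typically use a managed token vault and securely stored keys.

```python
import hmac
import hashlib

# Hypothetical secret held only by the data-privacy team (assumption for illustration).
TOKEN_KEY = b"replace-with-a-securely-stored-secret"

def tokenize(value: str) -> str:
    """Map a quasi-identifier to a consistent, non-identifying token.

    The same input always yields the same token (preserving referential
    integrity), but the original value cannot be recovered without the key.
    """
    digest = hmac.new(TOKEN_KEY, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]  # shortened token for readability

# Every row containing ZIP 90210 receives the same token,
# so the model can still learn ZIP-level fraud patterns.
print(tokenize("90210"))
print(tokenize("90210") == tokenize("90210"))  # True: consistent mapping
```

Because the mapping is deterministic, grouping, joining, and one-hot encoding on the tokenized column behave exactly as they would on the original values.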
Completely removing quasi-identifiers is incorrect because it needlessly discards potentially valuable predictive information, harming model performance.
Character masking destroys the informational content of the data, as all unique values within a column would become identical, making them useless as features.
Encrypting the dataset is a crucial security measure for data at rest, but it is not a de-identification technique for analysis. The data is still in its original, identifiable form once decrypted for use, failing to meet the de-identification requirement during processing.
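As a hedged sketch of that last point, the snippet below uses the cryptography package's Fernet cipher (assumed to be installed) to show that decryption restores the record exactly as it was, so the PII is fully identifiable again at analysis time.

```python
# Requires: pip install cryptography (assumption: the package is available)
from cryptography.fernet import Fernet

key = Fernet.generate_key()
cipher = Fernet(key)

record = b"Jane Doe,jane.doe@example.com,90210,1985-04-12"
encrypted = cipher.encrypt(record)   # protects the data at rest / in transit
decrypted = cipher.decrypt(encrypted)

print(decrypted == record)  # True: the original PII reappears once decrypted for training
```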