CompTIA DataX DY0-001 (V1) Practice Question

A data science team is developing a fraud detection model for a financial institution. The dataset contains highly sensitive customer information and is severely imbalanced, with fraudulent transactions representing a very small minority class. The primary goal is to generate a high-fidelity synthetic dataset that accurately captures the complex, non-linear correlations found in the original data, which will be used to train a sophisticated deep learning model. A secondary but critical requirement is to minimize the risk of re-identification of individuals from the original dataset.

Given this scenario, which of the following data augmentation techniques is the most appropriate choice?

  • Generate synthetic data by fitting a multivariate normal distribution to the original data's features and sampling from it. This ensures the synthetic data maintains the same mean and covariance structure as the original.

  • Use a Variational Autoencoder (VAE) to learn a latent representation of the data and generate new samples from it. This allows for probabilistic generation of diverse data points.

  • Apply the Synthetic Minority Over-sampling Technique (SMOTE) to the minority class. This method is computationally efficient and directly addresses the class imbalance by creating new minority instances.

  • Implement a Generative Adversarial Network (GAN) trained on the original dataset. This approach excels at learning the underlying data distribution, including complex non-linear relationships, to produce highly realistic synthetic samples.
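To make two of the simpler options concrete, here is a minimal sketch using only NumPy. It assumes a hypothetical two-feature toy dataset with a quadratic relationship, not the scenario's real data. The helper names `sample_gaussian` and `smote_like` are illustrative, and `smote_like` is a simplified stand-in for SMOTE that uses a single nearest neighbour rather than the usual k neighbours. It shows that sampling from a fitted multivariate normal reproduces mean and covariance but not a non-linear dependency, and that SMOTE-style points are interpolations between real minority records.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: y depends on x non-linearly (quadratically).
x = rng.normal(size=2000)
y = x**2 + 0.1 * rng.normal(size=2000)
real = np.column_stack([x, y])


def sample_gaussian(data, n, rng):
    """Fit a multivariate normal (mean and covariance) and draw n samples."""
    mean = data.mean(axis=0)
    cov = np.cov(data, rowvar=False)
    return rng.multivariate_normal(mean, cov, size=n)


def smote_like(minority, n, rng):
    """Simplified SMOTE-style oversampling (assumption: k=1 neighbour).

    Each new point lies on the line segment between a randomly chosen
    minority row and that row's nearest minority neighbour.
    """
    # Pairwise Euclidean distances between minority rows.
    dists = np.linalg.norm(minority[:, None, :] - minority[None, :, :], axis=2)
    np.fill_diagonal(dists, np.inf)  # a row is not its own neighbour
    nearest = dists.argmin(axis=1)

    idx = rng.integers(0, len(minority), size=n)
    base = minority[idx]
    neighbour = minority[nearest[idx]]
    gap = rng.random((n, 1))  # interpolation weight in [0, 1)
    return base + gap * (neighbour - base)


synthetic = sample_gaussian(real, 2000, rng)

# Correlation between x^2 and y: strong in the real data, weak in the
# Gaussian samples, because covariance only encodes linear structure.
real_corr = np.corrcoef(real[:, 0] ** 2, real[:, 1])[0, 1]
synth_corr = np.corrcoef(synthetic[:, 0] ** 2, synthetic[:, 1])[0, 1]

# Treat the first 50 rows as a stand-in minority class (illustration only).
minority = real[:50]
oversampled = smote_like(minority, 200, rng)
```

Comparing `real_corr` with `synth_corr` illustrates the gap between matching first- and second-order statistics and capturing non-linear correlations. Because each `oversampled` row sits between two real records, interpolation alone does little to address the re-identification requirement.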
