AWS Certified Data Engineer Associate DEA-C01 Practice Question

A data engineer is developing a production ML workflow that uses Amazon SageMaker Pipelines to read raw files from Amazon S3, perform data preprocessing, train a model, and deploy the model to a SageMaker endpoint. The company must keep an auditable, end-to-end record of every dataset, processing job, model version, and endpoint created by the pipeline while writing as little custom tracking code as possible. Which solution meets these requirements?

  • Enable SageMaker ML Lineage Tracking in the SageMaker Pipeline so that each step automatically registers its artifacts and relationships, then query the lineage graph through the SageMaker Lineage API.

  • Turn on AWS CloudTrail for all SageMaker API calls and analyze the resulting logs with Amazon Athena to reconstruct the lineage of artifacts.

  • Refactor the workflow into AWS Step Functions and enable AWS X-Ray tracing so that each state transition captures lineage information for audit queries.

  • Run an AWS Glue crawler after every pipeline step and store the results in the AWS Glue Data Catalog to represent lineage between datasets, jobs, and models.
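The first option mentions querying the lineage graph through the SageMaker Lineage API. A minimal sketch of what such a query might look like is shown below, assuming a boto3 SageMaker client is passed in and that the starting ARN (for example, an endpoint's lineage context) is already known. The helper name trace_upstream_lineage and its max_depth default are hypothetical, chosen only for illustration; query_lineage and its StartArns, Direction, IncludeEdges, MaxDepth, and NextToken parameters are part of the boto3 SageMaker client.

```python
from typing import Any, Dict, List


def trace_upstream_lineage(
    sm_client: Any, start_arn: str, max_depth: int = 10
) -> Dict[str, List[dict]]:
    """Collect all lineage vertices and edges upstream of start_arn.

    sm_client is expected to behave like boto3.client("sagemaker").
    This helper is a hypothetical illustration, not an AWS-provided API.
    """
    vertices: List[dict] = []
    edges: List[dict] = []
    kwargs: Dict[str, Any] = {
        "StartArns": [start_arn],
        "Direction": "Ascendants",  # walk back toward models, jobs, datasets
        "IncludeEdges": True,       # also return the relationships
        "MaxDepth": max_depth,      # assumed limit; tune to the pipeline size
    }
    while True:
        page = sm_client.query_lineage(**kwargs)
        vertices.extend(page.get("Vertices", []))
        edges.extend(page.get("Edges", []))
        token = page.get("NextToken")
        if not token:
            break
        kwargs["NextToken"] = token  # request the next page of results
    return {"Vertices": vertices, "Edges": edges}
```

In practice the client would be created with boto3.client("sagemaker") and the returned vertices could be filtered by their LineageType to list the artifacts, actions, and contexts that contributed to a given endpoint.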

Data Store Management