AWS Certified Data Engineer Associate DEA-C01 Practice Question

A data platform stores raw clickstream logs in an S3 landing bucket. One AWS Glue job writes partitioned Parquet data to an S3 curated bucket. A second Glue job joins the curated data with a Redshift user table and writes the results to an S3 reporting bucket that is queried by Athena. Auditors must be able to trace each dashboard back to the original landing files using only AWS-managed services. Which solution delivers end-to-end data lineage?

  • Enable object-level logging for the S3 buckets in AWS CloudTrail and use Amazon Athena to join the CloudTrail logs with the AWS Glue job run history.

  • Add S3 version IDs of all source objects as column values during each transformation and store the augmented datasets in Amazon Redshift for later querying.

  • Use Amazon SageMaker ML Lineage Tracking APIs to register each Glue job run and its input and output locations.

  • Register the landing, curated, and reporting S3 locations as tables in the AWS Glue Data Catalog and run both ETL jobs from AWS Glue so the Glue lineage graph automatically captures object-level transformations.
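To make the "end-to-end lineage" requirement concrete, the sketch below models the pipeline from the scenario as a small graph and walks it upstream from the dashboard to its original inputs. It is only an illustration of what auditors need to be able to answer, not an implementation of any AWS service or of any of the options above. The node names (`athena_dashboard`, `glue_job_parquet`, `glue_job_join`, the `s3://...-bucket/` URIs) and the `trace_sources` helper are hypothetical and chosen for this example.

```python
from collections import deque

# Hypothetical names for the datasets and jobs in the scenario.
# Each key maps an output to the inputs it was derived from.
UPSTREAM = {
    "athena_dashboard": ["s3://reporting-bucket/"],
    "s3://reporting-bucket/": ["glue_job_join"],
    "glue_job_join": ["s3://curated-bucket/", "redshift.user_table"],
    "s3://curated-bucket/": ["glue_job_parquet"],
    "glue_job_parquet": ["s3://landing-bucket/"],
}


def trace_sources(node, upstream=UPSTREAM):
    """Return the root inputs (nodes with no recorded upstream) reachable from node."""
    seen = {node}
    roots = set()
    queue = deque([node])
    while queue:
        current = queue.popleft()
        parents = upstream.get(current, [])
        if not parents:
            # Nothing upstream is recorded, so this is an original source.
            roots.add(current)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return roots


if __name__ == "__main__":
    for source in sorted(trace_sources("athena_dashboard")):
        print(source)
```

A lineage solution satisfies the auditors only if every hop in this chain is recorded: a gap at either Glue job would stop the walk before it reaches the landing bucket.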

Data Store Management