AWS Certified Data Engineer Associate DEA-C01 Practice Question

A retail company lands daily CSV files in Amazon S3. Before an AWS Glue ETL job loads the data, the team must automatically confirm that no mandatory column contains null values and that the total_price field is between 0 and 10,000. The solution must emit pass/fail results to Amazon EventBridge and block the ETL job if any rule fails, while minimizing custom code. Which approach meets these requirements?

  • Create an AWS Glue DataBrew profile job with a ruleset that enforces the null and range checks. Configure an EventBridge rule that listens for a DataBrew Ruleset Validation Result event with a result of SUCCEEDED and then starts the Glue ETL job.

  • Run a scheduled Amazon Athena query that counts rows with nulls or out-of-range totals, store the result in S3, invoke an AWS Lambda function to publish a custom CloudWatch metric, and have AWS Step Functions decide whether to start the Glue ETL job.

  • Enable Amazon Macie on the S3 bucket to detect data issues, route Macie findings to EventBridge, and launch the Glue ETL job only when no new findings are generated.

  • Use the open-source Deequ library on an Amazon EMR cluster to run Spark data quality tests, push results to CloudWatch through the CloudWatch agent, and trigger the Glue ETL job from AWS Step Functions if all tests pass.
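
For reference, the two rules in the question stem can be written out in plain Python. This is only a minimal sketch of what "not null" and "between 0 and 10,000" mean for a CSV file, independent of which option is chosen. The function name validate_csv and the order_id column are hypothetical (the question names only total_price), and treating the range as inclusive and a blank cell as null are assumptions.

```python
import csv
import io

# Hypothetical mandatory columns; the question only names total_price.
MANDATORY_COLUMNS = ["order_id", "total_price"]
# Assumption: the 0 to 10,000 range is inclusive at both ends.
PRICE_MIN, PRICE_MAX = 0, 10_000


def validate_csv(text, mandatory=MANDATORY_COLUMNS):
    """Return (passed, failures) for the null and range rules."""
    failures = []
    # Data rows start at line 2 because line 1 is the header.
    for line_no, row in enumerate(csv.DictReader(io.StringIO(text)), start=2):
        for col in mandatory:
            # Assumption: a missing or blank cell counts as null.
            if not (row.get(col) or "").strip():
                failures.append((line_no, col, "null"))
        raw = (row.get("total_price") or "").strip()
        if raw:
            try:
                price = float(raw)
            except ValueError:
                failures.append((line_no, "total_price", "not numeric"))
                continue
            if not PRICE_MIN <= price <= PRICE_MAX:
                failures.append((line_no, "total_price", "out of range"))
    # The file passes only if no rule failed on any row.
    return (not failures, failures)
```

Calling validate_csv on the text of a landed file returns a pass/fail flag plus the list of failing rows. The question asks for the same outcome with as little custom code like this as possible.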

Data Operations and Support