AWS Certified Data Engineer Associate DEA-C01 Practice Question

An AWS Step Functions pipeline ingests hourly JSON files that 1,000 devices write to Amazon S3. Each file must contain 60 records, and every record must include the keys device_id, ts, and temperature. The team needs a serverless way to check these completeness rules before loading the data into Amazon Redshift. If a rule fails, the pipeline must halt and a CloudWatch alarm must alert the on-call staff. Which approach is most cost-effective?

  • Develop an AWS Lambda function that reads the file from S3, counts rows, checks each JSON object, and emits a custom CloudWatch metric that triggers an alarm on anomalies.

  • Add a Step Functions task that runs an Amazon Athena query that counts rows with COUNT(*) and tests for NULL values; raise a CloudWatch alarm if the query result violates thresholds.

  • Create an AWS Glue DataBrew profile job with data-quality rules for the required columns and a RowCount = 60 rule. Start the job from Step Functions and use an EventBridge rule that catches a FAILED validationState to raise a CloudWatch alarm and fail the state machine.

  • Load the file into Amazon Redshift with the COPY command configured with MAXERROR 0 and ACCEPTANYDATE, then trigger a CloudWatch alarm if the COPY operation fails.
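
For reference, the two completeness rules in the question (exactly 60 records per file, and the keys device_id, ts, and temperature in every record) can be sketched in plain Python. This is only an illustration of the rules themselves, not an endorsement of any option above. The question does not say how records are laid out inside a file, so the sketch assumes each file is a single JSON array of objects; the function name check_completeness and the constants are hypothetical.

```python
import json

# Rule values taken from the question text.
REQUIRED_KEYS = {"device_id", "ts", "temperature"}
EXPECTED_RECORDS = 60


def check_completeness(raw_json: str) -> list:
    """Return a list of rule violations; an empty list means the file passes.

    Assumption: the file body is one JSON array of objects.
    """
    records = json.loads(raw_json)
    violations = []

    # Rule 1: the file must contain exactly 60 records.
    if len(records) != EXPECTED_RECORDS:
        violations.append(
            f"expected {EXPECTED_RECORDS} records, found {len(records)}"
        )

    # Rule 2: every record must carry all required keys.
    for index, record in enumerate(records):
        missing = REQUIRED_KEYS - set(record)
        if missing:
            violations.append(f"record {index} is missing {sorted(missing)}")

    return violations
```

Whichever service runs the check, a non-empty result would then need to be turned into a failed pipeline step and a CloudWatch alarm, as the question requires.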
