AWS Certified Data Engineer Associate DEA-C01 Practice Question

An analytics team receives hourly CSV files from external vendors. When a file lands in an S3 bucket, it must be validated, transformed with AWS Glue, and loaded into Amazon Redshift. The solution must be serverless, event-driven, include retry logic, and minimize operational overhead. Which architecture best meets these requirements?

  • Create a CloudWatch Events scheduled rule that runs every 5 minutes and invokes a Lambda function. The function lists recently added objects, kicks off an AWS Batch job to transform the data, and then loads the results into Redshift.

  • Deploy Apache Airflow on an EC2 Auto Scaling group and build a DAG that polls the S3 bucket every minute, then starts a Glue job and a Redshift COPY task.

  • Configure an S3 Event Notification to deliver ObjectCreated events to EventBridge, which triggers a Step Functions state machine. The state machine runs a Glue job for transformation, then uses the Redshift Data API to issue a COPY command. Step Functions built-in retries handle transient failures.

  • Set up Kinesis Data Firehose with the S3 bucket as the data source, enable transformation with a Lambda function, and configure the delivery stream to load directly into Amazon Redshift.
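
For illustration, the event-driven flow described in the third option (S3 event to EventBridge to Step Functions, with a Glue job followed by a Redshift COPY) could be sketched roughly as below. This is a minimal sketch, not a definitive implementation: the bucket, Glue job, cluster, database, user, table, and IAM role names are all hypothetical placeholders, and the retry values are arbitrary. The script only builds and prints the JSON documents; it does not call AWS.

```python
import json

# Every identifier below is a hypothetical placeholder, not taken from the question.
SOURCE_BUCKET = "example-vendor-landing-bucket"
GLUE_JOB_NAME = "example-vendor-csv-transform"
COPY_SQL = (
    "COPY analytics.vendor_data "
    "FROM 's3://example-transformed-bucket/vendor/' "
    "IAM_ROLE 'arn:aws:iam::123456789012:role/ExampleRedshiftCopyRole' "
    "FORMAT AS PARQUET"  # assumes the Glue job writes Parquet
)

# Retry transient task failures with exponential backoff (values are arbitrary).
RETRY_POLICY = [
    {
        "ErrorEquals": ["States.TaskFailed"],
        "IntervalSeconds": 30,
        "MaxAttempts": 3,
        "BackoffRate": 2.0,
    }
]


def build_event_pattern(bucket: str) -> dict:
    """EventBridge pattern matching S3 'Object Created' events for one bucket."""
    return {
        "source": ["aws.s3"],
        "detail-type": ["Object Created"],
        "detail": {"bucket": {"name": [bucket]}},
    }


def build_state_machine(glue_job: str, copy_sql: str) -> dict:
    """Amazon States Language definition: run the Glue job, then issue COPY."""
    return {
        "Comment": "Sketch: transform a new vendor file, then load it into Redshift",
        "StartAt": "TransformWithGlue",
        "States": {
            "TransformWithGlue": {
                "Type": "Task",
                # The .sync suffix makes Step Functions wait for the job run to finish.
                "Resource": "arn:aws:states:::glue:startJobRun.sync",
                "Parameters": {
                    "JobName": glue_job,
                    "Arguments": {
                        # Pass the object location from the EventBridge event.
                        "--source_bucket.$": "$.detail.bucket.name",
                        "--source_key.$": "$.detail.object.key",
                    },
                },
                "ResultPath": "$.glue",  # keep the original event in the state
                "Retry": RETRY_POLICY,
                "Next": "CopyIntoRedshift",
            },
            "CopyIntoRedshift": {
                "Type": "Task",
                # ExecuteStatement returns immediately; a fuller workflow would
                # poll DescribeStatement to confirm that the COPY succeeded.
                "Resource": "arn:aws:states:::aws-sdk:redshiftdata:executeStatement",
                "Parameters": {
                    "ClusterIdentifier": "example-cluster",
                    "Database": "analytics",
                    "DbUser": "example_loader",
                    "Sql": copy_sql,
                },
                "Retry": RETRY_POLICY,
                "End": True,
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_event_pattern(SOURCE_BUCKET), indent=2))
    print(json.dumps(build_state_machine(GLUE_JOB_NAME, COPY_SQL), indent=2))
```

For the pattern to match anything, the bucket must have delivery of notifications to EventBridge enabled, and an EventBridge rule using this pattern would name the state machine as its target. The `Retry` blocks are what supply the retry logic inside the workflow itself rather than in custom code.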

Data Ingestion and Transformation