AWS Certified Data Engineer Associate DEA-C01 Practice Question
A data engineering team must ingest thousands of click-stream events per second from a public website, enrich each record, and store the results in Amazon S3. The solution must provide sub-second end-to-end latency, allow any record to be replayed when a downstream transformation fails, and minimize custom state management. Which approach best meets these requirements according to AWS best practices?
Configure an Amazon Kinesis Data Firehose delivery stream with data transformation enabled and an Amazon S3 destination.
Create an Amazon Kinesis Data Stream and use an AWS Glue streaming ETL job to read from the stream, enrich the records, and write the output to Amazon S3.
Send events to Amazon EventBridge, invoke an AWS Lambda function for enrichment, and write the results to Amazon S3.
Stream events directly to AWS Lambda using an Amazon Kinesis Data Stream event source mapping; the function enriches data and stores it in Amazon S3.
Using an AWS Glue streaming ETL job that reads from Amazon Kinesis Data Streams makes the consumer stateful without requiring engineers to manage that state themselves. Glue's Spark Structured Streaming engine automatically writes checkpoint data (including the end offset for each shard) to an Amazon S3 location that you specify. If a failure occurs, simply deleting or rolling back the checkpoint directory causes the job to start from an earlier position and replay any data that is still within the stream's retention period.
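The checkpoint-and-replay behavior described above can be illustrated with a small toy model. This is a sketch in plain Python, not Glue or Spark code: the in-memory SHARD list stands in for records still within the stream's retention period, a local directory stands in for the S3 checkpoint location, and the names run_micro_batch, reset_checkpoint, and offset.json are hypothetical and chosen only for illustration.

```python
import json
import tempfile
from pathlib import Path

# Toy stand-in for one shard: records still inside the retention period.
SHARD = [{"seq": i, "page": f"/p/{i}"} for i in range(6)]


def load_checkpoint(checkpoint_dir: Path) -> int:
    """Return the next position to read, or 0 when no checkpoint exists."""
    marker = checkpoint_dir / "offset.json"
    if not marker.exists():
        return 0
    return json.loads(marker.read_text())["next_seq"]


def save_checkpoint(checkpoint_dir: Path, next_seq: int) -> None:
    """Persist the position that the next micro-batch should start from."""
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    marker = checkpoint_dir / "offset.json"
    marker.write_text(json.dumps({"next_seq": next_seq}))


def run_micro_batch(checkpoint_dir: Path, batch_size: int = 3) -> list:
    """Read one batch from the checkpointed position, enrich it, advance."""
    start = load_checkpoint(checkpoint_dir)
    batch = SHARD[start:start + batch_size]
    enriched = [{**record, "enriched": True} for record in batch]
    if batch:
        save_checkpoint(checkpoint_dir, start + len(batch))
    return enriched


def reset_checkpoint(checkpoint_dir: Path) -> None:
    """Delete the checkpoint so the next run replays from the beginning."""
    (checkpoint_dir / "offset.json").unlink(missing_ok=True)


if __name__ == "__main__":
    ckpt = Path(tempfile.mkdtemp()) / "checkpoint"
    print([r["seq"] for r in run_micro_batch(ckpt)])  # first batch
    print([r["seq"] for r in run_micro_batch(ckpt)])  # resumes after checkpoint
    reset_checkpoint(ckpt)                            # simulate rollback
    print([r["seq"] for r in run_micro_batch(ckpt)])  # replays first batch
```

The engineer's code never tracks positions itself; it only points the consumer at a checkpoint location. Replay is an operational action on that location, and it can only reach records that the stream still retains.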
EventBridge with Lambda can replay events only after you first enable an archive and wait for events to arrive in that archive (AWS recommends a delay of about 10 minutes), which makes prompt replay after a failed transformation impractical. Kinesis Data Firehose offers no built-in way to resend records that have already been delivered, and a Lambda consumer that uses a Kinesis event source mapping would require creating a new mapping or manipulating iterator positions to force reprocessing, which is custom state management. Therefore, the Glue streaming ETL solution best satisfies the throughput, latency, replayability, and operational-overhead constraints.
Data Ingestion and Transformation