AWS Certified Data Engineer - Associate (DEA-C01) Practice Question
A financial company runs a 500 GB on-premises MySQL 5.7 database that produces about 1,000 write transactions per second. The data engineering team must ingest every change into Amazon Kinesis Data Streams within 1 minute so a downstream application can perform near-real-time analytics. The solution must let the team replay the data if the consumer application fails and should require the least operational effort. Which approach meets these requirements?
Create an AWS DMS replication instance with a MySQL source endpoint and a Kinesis Data Streams target endpoint. Configure a full-load-plus-CDC task, enable StreamMode, specify a partition-key expression, and set the stream retention period to 24 hours.
Set up AWS DataSync to copy hourly database backups to Amazon S3 and launch an Amazon EMR cluster that streams the copied files into Amazon Kinesis Data Streams.
Use AWS DMS to migrate the database to Amazon S3 with a full-load-only task. Configure Amazon S3 event notifications to start an AWS Glue job that writes new objects to Kinesis Data Streams.
Deploy an AWS Lambda function that reads MySQL binary logs by using the mysqlbinlog utility and publishes each event to Amazon Kinesis Data Streams. Trigger the function on a fixed schedule with Amazon EventBridge.
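Sizing note: the scenario's write rate and the replay requirement can be checked with a rough calculation. The sketch below is a back-of-the-envelope estimate, not a definitive sizing method. It assumes a provisioned-mode stream with the documented per-shard write limits of 1,000 records per second and 1 MB per second (taken here as 1,000,000 bytes), and it assumes one Kinesis record per write transaction. The average record sizes and the function names are hypothetical; the question does not state them.

```python
# Rough sizing sketch for the scenario (assumptions are labeled inline).

# Per-shard write limits for a provisioned-mode Kinesis data stream.
SHARD_MAX_RECORDS_PER_SEC = 1_000
SHARD_MAX_BYTES_PER_SEC = 1_000_000  # "1 MB/s", approximated as 10^6 bytes


def ceil_div(numerator: int, denominator: int) -> int:
    """Integer ceiling division that avoids floating-point rounding."""
    return -(-numerator // denominator)


def shards_needed(records_per_sec: int, avg_record_bytes: int) -> int:
    """Estimate the minimum shard count for a steady write rate.

    Assumes one Kinesis record per write transaction and an even
    distribution of partition keys across shards (both hypothetical).
    """
    by_count = ceil_div(records_per_sec, SHARD_MAX_RECORDS_PER_SEC)
    by_bytes = ceil_div(records_per_sec * avg_record_bytes, SHARD_MAX_BYTES_PER_SEC)
    return max(1, by_count, by_bytes)


def records_retained(records_per_sec: int, retention_hours: int) -> int:
    """Number of records available for replay at a steady write rate."""
    return records_per_sec * retention_hours * 3600


if __name__ == "__main__":
    # 1,000 writes per second comes from the question; 500 and 2,000 bytes
    # are illustrative record sizes only.
    print(shards_needed(1_000, 500))
    print(shards_needed(1_000, 2_000))
    print(records_retained(1_000, 24))
```

In practice a CDC tool may emit one record per changed row rather than per transaction, and keys are rarely perfectly balanced, so the result is best read as a lower bound.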