AWS Certified Data Engineer Associate DEA-C01 Practice Question

A retailer captures clickstream events in an existing Amazon Kinesis data stream. A data engineer must continuously land these events in an S3-based data lake as Apache Parquet files partitioned by event_date. The solution must require the least custom code, tolerate evolving event schemas without breaking downstream queries, and provide at-least-once processing guarantees. Which approach meets these requirements?

  • Attach an AWS Lambda function to the Kinesis stream that buffers records in batches of 100, converts each batch to Parquet, and uploads it to Amazon S3.

  • Create an AWS Glue streaming ETL job that reads from the Kinesis data stream, enables job bookmarks, and writes DynamicFrames in Parquet format partitioned by event_date to Amazon S3.

  • Develop a Spark Structured Streaming application on an always-on Amazon EMR cluster that consumes the Kinesis stream and writes partitioned Parquet files to Amazon S3.

  • Configure an Amazon Kinesis Data Firehose delivery stream with an AWS Lambda transformation that converts records to Parquet and uses dynamic partitioning to deliver data to Amazon S3.
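
For concreteness, the AWS Glue streaming ETL approach described in the second option might look roughly like the following PySpark sketch. The stream ARN, S3 paths, window size, and JSON classification are illustrative placeholders rather than values given in the question; note that Glue streaming jobs track progress through S3 checkpoints, which is what supplies the at-least-once guarantee.

import sys
from awsglue.context import GlueContext
from awsglue.dynamicframe import DynamicFrame
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the Kinesis stream as a streaming DataFrame; the schema is inferred
# per micro-batch, which is how Glue tolerates evolving event schemas.
events = glue_context.create_data_frame.from_options(
    connection_type="kinesis",
    connection_options={
        "streamARN": "arn:aws:kinesis:us-east-1:111122223333:stream/clickstream",  # placeholder
        "classification": "json",
        "startingPosition": "TRIM_HORIZON",
        "inferSchema": "true",
    },
)

def process_batch(data_frame, batch_id):
    # Convert each micro-batch to a DynamicFrame and write it to S3 as
    # Parquet, partitioned by the event_date field in the records.
    if data_frame.count() > 0:
        dyf = DynamicFrame.fromDF(data_frame, glue_context, "events")
        glue_context.write_dynamic_frame.from_options(
            frame=dyf,
            connection_type="s3",
            connection_options={
                "path": "s3://example-data-lake/clickstream/",  # placeholder
                "partitionKeys": ["event_date"],
            },
            format="parquet",
        )

# Checkpointing provides the at-least-once guarantee: a failed micro-batch
# is reprocessed from the last committed checkpoint.
glue_context.forEachBatch(
    frame=events,
    batch_function=process_batch,
    options={
        "windowSize": "60 seconds",
        "checkpointLocation": "s3://example-data-lake/checkpoints/clickstream/",  # placeholder
    },
)
job.commit()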
