AWS Certified Data Engineer Associate DEA-C01 Practice Question

A Kinesis Data Stream with three shards ingests up to 2,500 clickstream events per second. The data engineering team must store the raw events in Amazon S3 every 60 seconds as compressed Apache Parquet files for a downstream batch job. The solution must scale automatically, require little or no custom code, and must not reduce the throughput available to existing stream consumers. Which approach meets these requirements in the most cost-effective way?

  • Deploy an AWS Lambda function as a stream consumer that batches records for 60 seconds, converts them to Parquet with the AWS SDK, and writes the files to S3.

  • Create a Kinesis Data Firehose delivery stream that uses the existing Kinesis Data Stream as its source, set the S3 bucket as the destination, enable Parquet conversion with GZIP compression, and configure a 60-second buffer interval.

  • Run a Kinesis Client Library (KCL) application on an EC2 Auto Scaling group that aggregates one-minute batches and uploads compressed Parquet files to S3.

  • Launch an Amazon Kinesis Data Analytics application that reads the stream and uses a custom SQL sink to write Parquet files to S3 every minute.
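For reference, the second option can be expressed as a short configuration sketch. The following is a minimal, hypothetical boto3 example of a Firehose delivery stream that reads from an existing Kinesis Data Stream and delivers GZIP-compressed Parquet files to S3 on a 60-second buffer; all ARNs, names, and the Glue schema reference are placeholders, not values from the question.

```python
import boto3

firehose = boto3.client("firehose")

firehose.create_delivery_stream(
    DeliveryStreamName="clickstream-to-s3",  # placeholder name
    DeliveryStreamType="KinesisStreamAsSource",
    KinesisStreamSourceConfiguration={
        "KinesisStreamARN": "arn:aws:kinesis:us-east-1:123456789012:stream/clickstream",
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-read-role",
    },
    ExtendedS3DestinationConfiguration={
        "RoleARN": "arn:aws:iam::123456789012:role/firehose-write-role",
        "BucketARN": "arn:aws:s3:::raw-clickstream-bucket",
        # 60-second buffer interval; record format conversion requires a
        # buffer size hint of at least 64 MB.
        "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 64},
        # S3-level compression must stay UNCOMPRESSED when Parquet conversion
        # is enabled; GZIP is applied by the Parquet serializer instead.
        "CompressionFormat": "UNCOMPRESSED",
        "DataFormatConversionConfiguration": {
            "Enabled": True,
            "InputFormatConfiguration": {"Deserializer": {"OpenXJsonSerDe": {}}},
            "OutputFormatConfiguration": {
                "Serializer": {"ParquetSerDe": {"Compression": "GZIP"}}
            },
            # The record schema comes from an AWS Glue Data Catalog table
            # (placeholder database and table names).
            "SchemaConfiguration": {
                "RoleARN": "arn:aws:iam::123456789012:role/firehose-glue-role",
                "DatabaseName": "clickstream_db",
                "TableName": "raw_events",
                "Region": "us-east-1",
            },
        },
    },
)
```

This mirrors the managed, low-code approach the option describes: Firehose handles scaling and buffering, and the only prerequisites are the IAM roles and a Glue table defining the event schema.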

Domain: Data Ingestion and Transformation