AWS Certified Data Engineer Associate DEA-C01 Practice Question
A Kinesis Data Stream with three shards ingests up to 2,500 clickstream events per second. The data engineering team must store the raw events in Amazon S3 every 60 seconds as compressed Apache Parquet files for a downstream batch job. The solution must scale automatically, require little or no code, and must not reduce throughput available to existing stream consumers. Which approach meets these requirements in the most cost-effective way?
Deploy an AWS Lambda function as a stream consumer that batches records for 60 seconds, converts them to Parquet with the AWS SDK, and writes the files to S3.
Create a Kinesis Data Firehose delivery stream that uses the existing Kinesis Data Stream as its source, set the S3 bucket as the destination, enable Parquet conversion with GZIP compression, and configure a 60-second buffer interval.
Run a Kinesis Client Library (KCL) application on an EC2 Auto Scaling group that aggregates one-minute batches and uploads compressed Parquet files to S3.
Launch an Amazon Kinesis Data Analytics application that reads the stream and uses a custom SQL sink to write Parquet files to S3 every minute.
A Kinesis Data Firehose delivery stream can use an existing Kinesis Data Stream directly as its source. Firehose scales automatically with the incoming throughput, buffers records until the configured buffer interval (60 seconds here) or buffer size is reached, converts them to Parquet, compresses the output (GZIP in this case), and writes the files to S3, with no custom code to build or operate. One caveat: according to the AWS documentation, Firehose reads the stream through the standard GetRecords path, about once per second per shard, rather than as an enhanced fan-out consumer, so its reads count toward the 2 MiB/second/shard limit shared by standard consumers. Existing consumers keep working unchanged, and any consumer that needs dedicated read throughput can register for enhanced fan-out itself. Lambda, Kinesis Data Analytics, and KCL applications would all require custom development and, in most cases, additional compute, which increases cost and operational overhead.
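The Firehose setup described above can be sketched as the request parameters for the `CreateDeliveryStream` API. This is a minimal illustration, not a production template. The function name `build_firehose_request`, the default stream name, the ARNs, the AWS Glue database and table, and the assumption that the clickstream events arrive as JSON are all hypothetical and not taken from the question. The sketch also assumes that, with format conversion enabled, the S3-level compression stays `UNCOMPRESSED` and GZIP is applied inside the Parquet serializer.

```python
from typing import Any, Dict


def build_firehose_request(
    stream_arn: str,
    bucket_arn: str,
    role_arn: str,
    glue_database: str,
    glue_table: str,
    name: str = "clickstream-to-s3",  # hypothetical delivery stream name
) -> Dict[str, Any]:
    """Build keyword arguments for the Firehose CreateDeliveryStream call."""
    return {
        "DeliveryStreamName": name,
        # Read from an existing Kinesis Data Stream instead of direct PUTs.
        "DeliveryStreamType": "KinesisStreamAsSource",
        "KinesisStreamSourceConfiguration": {
            "KinesisStreamARN": stream_arn,
            "RoleARN": role_arn,
        },
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,
            "BucketARN": bucket_arn,
            # Flush every 60 seconds, or earlier if the size hint is reached.
            "BufferingHints": {"IntervalInSeconds": 60, "SizeInMBs": 128},
            # Compression is handled by the Parquet serializer below.
            "CompressionFormat": "UNCOMPRESSED",
            "DataFormatConversionConfiguration": {
                "Enabled": True,
                # The Parquet schema comes from an AWS Glue table (assumed to exist).
                "SchemaConfiguration": {
                    "RoleARN": role_arn,
                    "DatabaseName": glue_database,
                    "TableName": glue_table,
                },
                # Assumption: incoming records are JSON.
                "InputFormatConfiguration": {
                    "Deserializer": {"OpenXJsonSerDe": {}}
                },
                "OutputFormatConfiguration": {
                    "Serializer": {"ParquetSerDe": {"Compression": "GZIP"}}
                },
            },
        },
    }
```

With boto3 installed and credentials configured, the result could be passed as `boto3.client("firehose").create_delivery_stream(**request)`. Building the parameters separately from the call keeps the configuration easy to inspect and test without touching AWS.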
Data Ingestion and Transformation