AWS Certified Data Engineer Associate DEA-C01 Practice Question
A retailer captures clickstream events in an existing Amazon Kinesis Data Stream. A data engineer must continuously land these events in an S3-based data lake as Apache Parquet files that are partitioned by event_date. The solution must require the least custom code, tolerate evolving event schemas without breaking downstream queries, and provide at-least-once processing guarantees. Which approach meets these requirements?
Attach an AWS Lambda function to the Kinesis stream that batches 100 records, converts them to Parquet, and uploads each batch to Amazon S3.
Create an AWS Glue streaming ETL job that reads from the Kinesis data stream, enables job bookmarks, and writes DynamicFrames in Parquet format partitioned by event_date to Amazon S3.
Develop a Spark Structured Streaming application on an always-on Amazon EMR cluster that consumes the Kinesis stream and writes partitioned Parquet files to Amazon S3.
Configure an Amazon Kinesis Data Firehose delivery stream with an AWS Lambda transformation that converts records to Parquet and uses dynamic partitioning to deliver data to Amazon S3.
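An AWS Glue streaming ETL job reads natively from Amazon Kinesis Data Streams and converts incoming records to DynamicFrames, which are self-describing and infer the schema from the data, so new or changed fields do not require a fixed schema up front. The job can write Parquet datasets to Amazon S3 partitioned by event_date. Glue streaming jobs track their position in the stream with checkpoints (the progress tracking that the option describes as job bookmarks); after a failure the job resumes from the last checkpoint, so records may be reprocessed but are not skipped, which gives at-least-once semantics. Because Glue can generate most of the Spark script, the engineer writes only minimal custom logic. The other options fall short: the Lambda function needs custom batching and Parquet-conversion code, the EMR application is a custom Spark program on an always-on cluster that the team must operate, and the Firehose option depends on a custom Lambda transformation to produce Parquet.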
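The at-least-once guarantee and the event_date partitioning can be illustrated with a small, self-contained sketch. This is not Glue code: it is a toy model in plain Python, and every name in it (CheckpointedConsumer, partition_key, the ts and id fields, the in-memory sink) is a hypothetical stand-in chosen for illustration. It assumes each event carries an epoch-seconds timestamp and shows why writing a batch before advancing the checkpoint can produce duplicates after a crash but never drops a record.

```python
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(event: dict) -> str:
    """Build a Hive-style partition prefix from the event's epoch-seconds timestamp."""
    day = datetime.fromtimestamp(event["ts"], tz=timezone.utc).date().isoformat()
    return f"event_date={day}"


class CheckpointedConsumer:
    """Toy consumer: write the batch first, advance the checkpoint second."""

    def __init__(self, stream, batch_size=2):
        self.stream = stream            # list of dicts standing in for a stream shard
        self.batch_size = batch_size
        self.checkpoint = 0             # index of the next unprocessed record
        self.sink = defaultdict(list)   # partition prefix -> record ids written

    def run_once(self, fail_before_commit=False) -> int:
        batch = self.stream[self.checkpoint:self.checkpoint + self.batch_size]
        for event in batch:
            self.sink[partition_key(event)].append(event["id"])
        if fail_before_commit:
            # The write already happened, but the checkpoint did not move.
            raise RuntimeError("simulated crash after write, before checkpoint")
        self.checkpoint += len(batch)
        return len(batch)


def demo():
    events = [
        {"id": "a", "ts": 1704067200},  # 2024-01-01T00:00:00Z
        {"id": "b", "ts": 1704070800},  # 2024-01-01T01:00:00Z
        {"id": "c", "ts": 1704153600},  # 2024-01-02T00:00:00Z
    ]
    consumer = CheckpointedConsumer(events)
    try:
        consumer.run_once(fail_before_commit=True)  # crash: checkpoint stays at 0
    except RuntimeError:
        pass
    while consumer.run_once():  # restart: replay from the last checkpoint
        pass
    return dict(consumer.sink), consumer.checkpoint


if __name__ == "__main__":
    print(demo())
```

In this run, records a and b land twice in the event_date=2024-01-01 partition because the first batch is replayed after the simulated crash, while c lands once under event_date=2024-01-02. Downstream queries that need exactly-once results would have to deduplicate, for example on a unique event id.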