AWS Certified Data Engineer Associate DEA-C01 Practice Question

A company stores CSV files of about 50 MB each in an Amazon S3 landing bucket every 5 minutes. A data engineer must automatically convert each file to Parquet, add an ingestion_timestamp column, and write the result to a separate S3 bucket organized by date and hour. The solution must remain fully serverless, minimize cost, and require little operational management. Which approach meets these requirements?

  • Configure an S3 event notification that invokes an AWS Lambda function using the AWS SDK for pandas to read the CSV object, add the timestamp column, and write a Parquet file to the destination bucket's date/hour prefix.

  • Use a Lambda function to start an on-demand Amazon EMR cluster that runs a conversion script and terminates the cluster after processing each batch of files.

  • Trigger an AWS Step Functions state machine from the S3 event that starts an AWS Glue Spark job to perform the conversion and load the result to the destination bucket.

  • Send the files to an Amazon Kinesis Data Firehose delivery stream that uses a Lambda transformation to convert records to Parquet before writing to the target bucket.
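The first option is the serverless, low-cost fit: a 50 MB CSV is well within Lambda's memory and 15-minute limits, and the AWS SDK for pandas (awswrangler) handles the CSV read, column add, and Parquet write directly against S3. Below is a minimal sketch of the two pieces of logic that option describes, kept to the standard library so it runs anywhere. The Hive-style `year=/month=/day=/hour=` prefix layout is an assumption (the question only says "organized by date and hour"), as are the function names.

```python
import csv
import io
from datetime import datetime, timezone


def destination_key(source_key: str, ts: datetime) -> str:
    """Map a landing-bucket CSV key to a date/hour-partitioned Parquet key.

    The year=/month=/day=/hour= layout is an assumed convention; any
    deterministic date/hour prefix satisfies the requirement.
    """
    stem = source_key.rsplit("/", 1)[-1].removesuffix(".csv")
    return f"year={ts:%Y}/month={ts:%m}/day={ts:%d}/hour={ts:%H}/{stem}.parquet"


def add_ingestion_timestamp(csv_text: str, ts: datetime) -> list[dict]:
    """Parse CSV rows and append an ingestion_timestamp column.

    In the real Lambda, awswrangler collapses this step and the Parquet
    write into wr.s3.read_csv(...) followed by wr.s3.to_parquet(...).
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    stamp = ts.isoformat()
    for row in rows:
        row["ingestion_timestamp"] = stamp
    return rows
```

In the deployed function, the S3 event record supplies `source_key`, and `ts = datetime.now(timezone.utc)` supplies the ingestion time used both for the new column and for the destination prefix.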

Domain: Data Operations and Support