AWS Certified Data Engineer Associate DEA-C01 Practice Question

A company receives a 500 MB CSV file from a partner each night. The file must land in Amazon S3, then be converted to Parquet and stored in a different S3 bucket. The data team wants a fully managed solution that runs at 02:00 UTC every day and sends failure notifications. Which approach meets these requirements with the least operational effort?

  • Use AWS Data Pipeline to spin up an Amazon EMR cluster on a daily schedule, run a Spark job that converts the file, and terminate the cluster when complete.

  • Run a cron job on an Amazon EC2 instance that downloads the file, converts it locally with a Python script, uploads the Parquet file to the destination bucket, and sends CloudWatch alarm emails on failure.

  • Create an AWS Transfer Family SFTP endpoint that writes to an S3 landing prefix. Configure an Amazon EventBridge rule to trigger an AWS Glue job at 02:00 UTC to convert the CSV to Parquet and publish errors to an Amazon SNS topic.

  • Configure an S3 event notification on the landing bucket to invoke an AWS Lambda function that reads the CSV object, converts it to Parquet, and writes the result to another bucket.
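The 02:00 UTC schedule and the failure notification named in the options can be sketched in code. The following is a minimal illustration, not a reference solution: the helper names, the rule name, the job name, and the topic ARN are all hypothetical. It only builds the request parameters that would be passed to the boto3 `events` client's `put_rule` call and the `sns` client's `publish` call, and it makes no AWS calls itself.

```python
def daily_cron_utc(hour, minute=0):
    """Return an EventBridge cron expression that fires once a day (UTC).

    EventBridge cron has six fields: minutes, hours, day-of-month,
    month, day-of-week, year. One of the two day fields must be '?'.
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute must be 0-59")
    return f"cron({minute} {hour} * * ? *)"


def scheduled_rule_request(rule_name, hour, minute=0):
    """Build keyword arguments for the events client's put_rule call."""
    return {
        "Name": rule_name,  # hypothetical rule name supplied by the caller
        "ScheduleExpression": daily_cron_utc(hour, minute),
        "State": "ENABLED",
    }


def failure_notification(topic_arn, job_name, error_text):
    """Build keyword arguments for the sns client's publish call."""
    return {
        "TopicArn": topic_arn,  # hypothetical topic ARN supplied by the caller
        # SNS limits email subjects to 100 characters, so truncate defensively.
        "Subject": f"Job {job_name} failed"[:100],
        "Message": error_text,
    }


if __name__ == "__main__":
    print(scheduled_rule_request("nightly-csv-to-parquet", hour=2))
    print(failure_notification(
        "arn:aws:sns:us-east-1:123456789012:example-topic",  # placeholder ARN
        "csv-to-parquet",
        "Conversion failed",
    ))
```

With boto3 installed and credentials configured, the dictionaries could be unpacked into `boto3.client("events").put_rule(**request)` and `boto3.client("sns").publish(**message)`. Attaching a target to the rule and defining the conversion job are left out of this sketch.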

Data Ingestion and Transformation