AWS Certified Data Engineer Associate DEA-C01 Practice Question
Your company stores web server logs as hourly CSV objects in a landing Amazon S3 bucket. Data engineers must convert each file to snappy-compressed Parquet partitioned by date in another S3 bucket, update the AWS Glue Data Catalog table, and keep operational overhead as low as possible. Which solution will satisfy these business requirements in the MOST cost-effective, maintenance-efficient way?
Launch a transient Amazon EMR cluster on a schedule; run a Spark script that converts and partitions the file, then terminate the cluster after completion.
Create an AWS Glue Spark ETL job triggered by an S3 event to convert the CSV file to snappy-compressed Parquet, write it to a date-partitioned path in the analytics bucket, and update the Glue Data Catalog.
Configure Amazon Kinesis Data Firehose with S3 as the destination and record-format conversion enabled, and invoke the Firehose PutRecord API from a Lambda function when each object is created.
Use AWS Data Pipeline to run a daily EC2 task that executes the open-source parquet-mr tool to convert incoming CSV files and copy them to the analytics bucket.
AWS Glue ETL jobs are serverless, so there are no clusters to provision or maintain. An Amazon S3 event can start the job automatically (for example, through an Amazon EventBridge rule or a small Lambda function). The job reads the incoming CSV object, converts it to snappy-compressed Parquet, writes it to a date-partitioned path in the target bucket, and then updates the Glue Data Catalog table, either directly from the job or by running a crawler. This meets the transformation, catalog-update, and low-overhead requirements with pay-per-use pricing. Amazon EMR and AWS Data Pipeline both require managing EC2 infrastructure, which increases cost and operational work. Kinesis Data Firehose supports record-format conversion, but invoking a Lambda function and pushing records through a Firehose delivery stream for every object adds unnecessary complexity and cost for batch files that are already in S3.
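One way the S3-event trigger could be wired up is a small Lambda function that starts the Glue job for each new object. The following is a minimal sketch under stated assumptions, not a reference implementation: the job name `csv-to-parquet`, the target prefix, the `--source_path` and `--target_path` argument names, and the idea that each object key contains a `YYYY-MM-DD` date are all hypothetical and do not come from the question. The only AWS call used is the boto3 Glue client's `start_job_run`.

```python
import re
from urllib.parse import unquote_plus

# Hypothetical names chosen for illustration only.
GLUE_JOB_NAME = "csv-to-parquet"
TARGET_PREFIX = "s3://analytics-bucket/weblogs"

# Assumption: every landing key embeds a date such as 2024-05-01.
_DATE_IN_KEY = re.compile(r"(\d{4})-(\d{2})-(\d{2})")


def build_job_arguments(record):
    """Turn one S3 event record into arguments for the Glue job."""
    bucket = record["s3"]["bucket"]["name"]
    # Object keys arrive URL-encoded in S3 event notifications.
    key = unquote_plus(record["s3"]["object"]["key"])
    match = _DATE_IN_KEY.search(key)
    if match is None:
        raise ValueError(f"no date found in key: {key}")
    date = "-".join(match.groups())
    return {
        "--source_path": f"s3://{bucket}/{key}",
        # Hive-style date partition folder for the Parquet output.
        "--target_path": f"{TARGET_PREFIX}/dt={date}/",
    }


def handler(event, context, glue_client=None):
    """Lambda entry point: start one Glue job run per created object."""
    if glue_client is None:
        import boto3  # imported lazily so the helper above has no AWS dependency
        glue_client = boto3.client("glue")
    run_ids = []
    for record in event["Records"]:
        response = glue_client.start_job_run(
            JobName=GLUE_JOB_NAME,
            Arguments=build_job_arguments(record),
        )
        run_ids.append(response["JobRunId"])
    return run_ids
```

In this sketch the Glue job itself would read the two arguments, load the CSV, and write Parquet to the target path; Spark writes Parquet with snappy compression by default, so the job may not need an explicit compression setting. Passing the partition path from the trigger keeps the job script free of key-parsing logic, though deriving the date inside the job would work equally well.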
Data Ingestion and Transformation