AWS Certified Data Engineer Associate DEA-C01 Practice Question
An e-commerce company lands raw clickstream JSON files in an S3 bucket every day. Analysts query the files with Amazon Athena, but many queries fail because some events lack required keys and several date fields use multiple formats. The data should be cleansed once so future Athena queries run without additional filters, and the solution should use managed, serverless services with minimal code. Which approach meets these requirements MOST effectively?
COPY the raw data into an Amazon Redshift table with MAXERROR allowed and clean the data with SQL before creating external tables for Athena.
Load the raw files into Amazon QuickSight SPICE and use calculated fields to fix bad dates before every dashboard refresh.
Create an AWS Glue ETL job that drops malformed records and standardizes the date field, then writes the results to a partitioned curated S3 prefix queried by Athena.
Define an Athena view that converts the date on the fly and filters out rows where the required key is null.
An AWS Glue ETL job can read the raw JSON files into a DynamicFrame, drop records that are missing mandatory keys (for example with the Filter transform), normalize the date formats (for example with the Map transform or Spark SQL date functions, since ApplyMapping on its own only renames and casts fields), and then write the curated data back to S3 under a partitioned prefix. Because Glue jobs are serverless and can be scheduled, each batch is cleansed once, before analysts query the curated location with Athena. The other options fall short: the Athena view repeats the cleansing in every query, QuickSight requires manual work in a visualization layer, and Redshift COPY moves the data to a different engine and still demands SQL-based cleanup. None of them satisfies the requirement for a one-time, code-light, serverless cleansing step.
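The cleansing rules described above can be sketched in plain Python. This is only an illustration of the per-record logic, not Glue code: in an actual job, similar functions would typically be passed to Glue transforms such as Filter and Map. The field names (`event_id`, `user_id`, `event_date`), the list of accepted date formats, and the `curated/` prefix are all assumptions made for the example.

```python
import json
from datetime import datetime

# Hypothetical field names and date formats, chosen only for illustration.
REQUIRED_KEYS = ("event_id", "user_id", "event_date")
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%Y%m%d")


def normalize_date(value):
    """Return the date as ISO 8601 (YYYY-MM-DD), or None if no format matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(value, fmt).date().isoformat()
        except (TypeError, ValueError):
            continue
    return None


def cleanse(lines):
    """Yield cleaned events; skip bad JSON, missing keys, and unparseable dates."""
    for line in lines:
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        if any(event.get(key) is None for key in REQUIRED_KEYS):
            continue
        iso_date = normalize_date(event["event_date"])
        if iso_date is None:
            continue
        yield {**event, "event_date": iso_date}


def partition_prefix(event):
    """Build a Hive-style partition path (assumed layout) from the clean date."""
    year, month, day = event["event_date"].split("-")
    return f"curated/year={year}/month={month}/day={day}/"
```

Writing each cleaned record under a `year=/month=/day=` prefix is one common layout that lets Athena prune partitions by date; the exact partition keys would depend on how analysts filter the data.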
AWS Certified Data Engineer Associate DEA-C01
Data Operations and Support