AWS Certified Data Engineer Associate DEA-C01 Practice Question
Your company stores raw click-stream events as gzip-compressed JSON files in an S3 bucket partitioned by dt=YYYY-MM-DD. Analysts report that some records occasionally lack the required session_id field. You must generate a curated dataset in another S3 prefix that contains only valid records, can be refreshed daily, and uses standard SQL while remaining fully serverless and cost-efficient. Which solution meets these requirements?
Provision an Amazon EMR cluster with Hive, schedule a daily HiveQL job that selects only records with a non-null session_id and writes the output to another S3 prefix, then terminate the cluster.
Run a CREATE TABLE AS SELECT query in Amazon Athena that filters out rows where session_id IS NULL and writes the results to a new S3 prefix; use Athena Scheduled Queries to execute the statement daily.
Create an AWS Glue DataBrew project pointing at the S3 dataset, add a recipe step to delete rows with null session_id, and run the DataBrew job on a daily schedule.
Load the raw files into Amazon Redshift Serverless each day, issue a SQL query to remove null session_id values, and UNLOAD the cleaned data back to a different S3 location.
The Amazon Athena option is the correct answer. Athena is a fully managed, serverless service that runs standard SQL against data in Amazon S3 and charges only for the data scanned. A CREATE TABLE AS SELECT (CTAS) query writes the filtered result set to a new S3 location and derives the new table's schema from the query result. A WHERE session_id IS NOT NULL clause removes the invalid rows. Athena Scheduled Queries can rerun that CTAS statement each day without additional infrastructure. By contrast, the Amazon EMR option requires provisioning and managing a cluster, so it is not fully serverless, and loading the data into Redshift Serverless only to UNLOAD it again adds cost and an extra step. AWS Glue DataBrew is serverless but relies on recipe steps rather than plain SQL, so it does not satisfy the explicit SQL requirement.
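To make the CTAS approach concrete, here is a minimal Python sketch that builds the daily statement and optionally submits it through the boto3 Athena client. The database, table, bucket, and prefix names are hypothetical placeholders that do not come from the question, and Parquet as the output format is an assumption. How the daily run is triggered is deliberately left out of the sketch.

```python
from datetime import date


def build_ctas(
    run_date: date,
    raw_table: str = "raw_db.clickstream_raw",        # hypothetical source table
    curated_db: str = "curated_db",                   # hypothetical target database
    curated_prefix: str = "s3://example-bucket/curated/clickstream",  # hypothetical
) -> str:
    """Return a CTAS statement that keeps only rows with a session_id."""
    day = run_date.isoformat()  # e.g. "2024-01-15", matching dt=YYYY-MM-DD
    # Athena table names allow underscores but not hyphens.
    table = f"{curated_db}.clickstream_valid_{day.replace('-', '_')}"
    return (
        f"CREATE TABLE {table}\n"
        f"WITH (format = 'PARQUET', "
        f"external_location = '{curated_prefix}/dt={day}/')\n"
        f"AS SELECT * FROM {raw_table}\n"
        f"WHERE dt = '{day}' AND session_id IS NOT NULL"
    )


def submit(sql: str, output_location: str) -> str:
    """Submit the statement to Athena and return the query execution ID.

    Requires boto3 and AWS credentials; imported lazily so that
    build_ctas can be used without either.
    """
    import boto3

    client = boto3.client("athena")
    response = client.start_query_execution(
        QueryString=sql,
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]


if __name__ == "__main__":
    print(build_ctas(date(2024, 1, 15)))
```

The sketch creates one table per day because a CTAS statement fails when the target table already exists or the target S3 location is not empty. An alternative design, not shown here, is to create the curated table once and append each day's valid rows with INSERT INTO.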
Exam domain: Data Operations and Support