AWS Certified Data Engineer Associate DEA-C01 Practice Question
A nightly Apache Spark job runs on Amazon EMR. The job fails intermittently, and data engineers must be able to search driver and executor logs from the last 30 days with Amazon Athena while keeping administration effort and costs low. Which approach meets these requirements?
Configure the EMR cluster to send logs to Amazon CloudWatch Logs, create a nightly export task to Amazon S3, and point Athena to the exported files.
Enable Amazon CloudWatch Logs for the EMR cluster, set log retention to 30 days, and query the logs with CloudWatch Logs Insights as needed.
Configure the EMR cluster to archive application logs to Amazon S3, run an AWS Glue crawler on the log bucket to catalog the files, and query the cataloged tables with Athena.
Stream EMR logs to Amazon Kinesis Data Firehose and deliver them to Amazon OpenSearch Service; use OpenSearch Dashboards for ad-hoc searches.
Archiving EMR application logs directly to Amazon S3 keeps the files in durable, low-cost storage without running additional infrastructure. An AWS Glue crawler can infer a schema for the log files and register them as tables in the AWS Glue Data Catalog. Because Athena queries data in S3 through catalog tables, engineers can then run ad-hoc SQL against the logs. An S3 lifecycle rule that expires log objects after 30 days covers the retention requirement, and the solution needs no recurring export tasks or streaming services.
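As a rough illustration of the retention and cataloging pieces described above, the following Python sketch builds the request payloads for a 30-day S3 expiration rule and a Glue crawler. The bucket, prefix, crawler, database, and role names are hypothetical placeholders, and the boto3 calls are left as comments because they require AWS credentials; treat this as a sketch under those assumptions rather than a complete setup.

```python
def log_expiration_rule(prefix: str, days: int = 30) -> dict:
    """Build an S3 lifecycle configuration that expires objects under `prefix` after `days` days."""
    if days < 1:
        raise ValueError("days must be a positive integer")
    return {
        "Rules": [
            {
                "ID": f"expire-emr-logs-after-{days}-days",
                "Filter": {"Prefix": prefix},
                "Status": "Enabled",
                "Expiration": {"Days": days},
            }
        ]
    }


def crawler_request(name: str, role_arn: str, database: str, s3_path: str) -> dict:
    """Build keyword arguments for a Glue crawler that catalogs one S3 path."""
    return {
        "Name": name,
        "Role": role_arn,
        "DatabaseName": database,
        "Targets": {"S3Targets": [{"Path": s3_path}]},
    }


# Hypothetical names, chosen only for illustration.
LOG_BUCKET = "example-emr-logs"
LOG_PREFIX = "spark-nightly/"

lifecycle = log_expiration_rule(LOG_PREFIX)
crawler = crawler_request(
    name="example-emr-log-crawler",
    role_arn="arn:aws:iam::123456789012:role/ExampleGlueCrawlerRole",
    database="example_emr_logs",
    s3_path=f"s3://{LOG_BUCKET}/{LOG_PREFIX}",
)

# With boto3 installed and credentials configured, the payloads could be applied as:
#   boto3.client("s3").put_bucket_lifecycle_configuration(
#       Bucket=LOG_BUCKET, LifecycleConfiguration=lifecycle)
#   boto3.client("glue").create_crawler(**crawler)
```

The EMR cluster itself would point its log archive at the same location, for example through the `LogUri` parameter of `run_job_flow`. How useful the crawler's inferred schema is depends on the log format, so a manually defined table may be a better fit for free-form text logs.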
Using only CloudWatch Logs with Logs Insights does not satisfy the requirement to search with Athena; querying those logs from Athena would require either deploying the optional Athena CloudWatch connector or exporting the logs to S3 first, which adds operational overhead. Exporting CloudWatch Logs to S3 each night likewise introduces maintenance tasks and extra charges. Streaming through Kinesis Data Firehose to OpenSearch Service is more complex and expensive than needed for simple log searches, and it replaces Athena with OpenSearch Dashboards.
Domain: Data Operations and Support