AWS Certified Data Engineer Associate DEA-C01 Practice Question

Your company runs several Amazon EMR clusters that execute nightly Spark jobs. The engineering team wants a managed solution to aggregate application and step logs from every cluster, retain the data for 30 days, and provide near-real-time search and interactive dashboards to troubleshoot performance issues. Which approach meets these requirements with the least operational overhead?

  • Install Filebeat on every EMR node to forward logs to an ELK stack running on a separate always-on EMR cluster and delete indices older than 30 days.

  • Stream logs from the EMR primary (master) node to Amazon Kinesis Data Streams, invoke AWS Lambda to load the records into Amazon DynamoDB, and build Amazon QuickSight analyses on the table.

  • Enable log archiving to Amazon S3, run Amazon Athena queries against the logs, and visualize the results in Amazon QuickSight with a 30-day lifecycle policy on the S3 bucket.

  • Configure each EMR cluster to publish its logs to CloudWatch Logs, create a CloudWatch Logs subscription that streams the logs to an Amazon OpenSearch Service domain, and set a 30-day retention policy on the log groups.
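
The retention and subscription settings named in the last option can be sketched in Python against the CloudWatch Logs API. This is a minimal illustration, not a reference setup: the log group name, filter name, and destination ARN are hypothetical, and the sketch assumes the subscription targets a forwarder (for example a Lambda function) that writes to the OpenSearch Service domain. The two builder functions only assemble request parameters, so they run without AWS access; `apply` needs boto3 and credentials.

```python
from typing import Any, Dict

RETENTION_DAYS = 30  # the 30-day requirement stated in the question


def retention_request(log_group: str, days: int = RETENTION_DAYS) -> Dict[str, Any]:
    """Keyword arguments for the CloudWatch Logs PutRetentionPolicy call."""
    return {"logGroupName": log_group, "retentionInDays": days}


def subscription_request(log_group: str, destination_arn: str) -> Dict[str, Any]:
    """Keyword arguments for the CloudWatch Logs PutSubscriptionFilter call."""
    return {
        "logGroupName": log_group,
        "filterName": "emr-logs-to-opensearch",  # hypothetical filter name
        "filterPattern": "",  # an empty pattern matches every log event
        "destinationArn": destination_arn,  # assumed forwarder ARN
    }


def apply(log_group: str, destination_arn: str) -> None:
    """Send both requests; requires boto3 and AWS credentials."""
    import boto3  # imported lazily so the builders above run without it

    logs = boto3.client("logs")
    logs.put_retention_policy(**retention_request(log_group))
    logs.put_subscription_filter(**subscription_request(log_group, destination_arn))


if __name__ == "__main__":
    # Hypothetical log group and forwarder ARN, for illustration only.
    group = "/example/emr/spark-steps"
    arn = "arn:aws:lambda:us-east-1:123456789012:function:example-logs-to-opensearch"
    print(retention_request(group))
    print(subscription_request(group, arn))
```

Depending on the destination type, the subscription may also need a `roleArn` parameter or a resource permission on the forwarder; those details are omitted here.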

Exam domain: Data Operations and Support