AWS Certified Data Engineer Associate DEA-C01 Practice Question

A company's nightly AWS Glue 3.0 Spark job reads 3 TB of Parquet data from Amazon S3 and loads it into an Amazon Redshift table. The job used to finish in 40 minutes, but the most recent runs take more than 2 hours, and several tasks stay in the READY state for an extended time. To quickly identify stage bottlenecks such as partition skew or insufficient executor memory without increasing cost, which action should a data engineer perform first?

  • Increase the job's maximum DPUs and enable continuous logging to Amazon CloudWatch Logs.

  • Configure the CloudWatch log group for the job to stream to Amazon S3 and query the logs with Amazon Athena.

  • Open the job's Spark UI from the AWS Glue console and review stage and executor metrics in the Spark History Server.

  • Enable job bookmarks so the job can skip partitions that have already been processed.
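The third option refers to the Glue Spark UI, which AWS Glue enables through two documented job arguments, `--enable-spark-ui` and `--spark-event-logs-path`. A minimal sketch of building that argument map is below; the function name and the S3 log path are placeholders for illustration, not values from the question.

```python
# Hypothetical helper: build the AWS Glue default arguments that turn on
# Spark UI event logging, so stage and executor metrics can be reviewed
# in the Spark History Server. The S3 prefix is a placeholder.
def spark_ui_arguments(event_log_path: str) -> dict:
    """Return Glue job arguments that enable Spark UI event logs."""
    return {
        "--enable-spark-ui": "true",               # documented Glue job argument
        "--spark-event-logs-path": event_log_path,  # S3 prefix for Spark event logs
    }

args = spark_ui_arguments("s3://example-bucket/spark-logs/")  # bucket is hypothetical
print(args["--enable-spark-ui"])
```

These arguments can be set on the job's default arguments (for example via the console or an `update_job` call), after which completed runs become inspectable in the Spark History Server without re-provisioning capacity.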

Domain: Data Operations and Support