AWS Certified Data Engineer Associate DEA-C01 Practice Question

An analytics team stores clickstream data as Parquet files in Amazon S3, partitioned by year/month/day (for example, s3://datalake/events/year=2025/month=10/day=07/). A daily AWS Glue crawler adds partitions to the AWS Glue Data Catalog so that analysts can query the table in Amazon Athena. After two years, the crawler's runtime and cost have increased significantly. The team wants to keep automatic partition discovery while minimizing ongoing cost and administration. What should they do?

  • Enable partition projection for the Athena table, configure the year, month, and day keys, and stop scheduling the AWS Glue crawler.

  • Change the existing crawler's recrawl policy to crawl new folders only and enable partition indexes on the Data Catalog table.

  • Create an AWS Lambda function that runs MSCK REPAIR TABLE after each crawler run to update the Data Catalog incrementally.

  • Switch to Amazon S3 event notifications that invoke an AWS Glue job calling the BatchCreatePartition API to add each new partition to the Data Catalog.
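
For readers unfamiliar with the partition projection mentioned in the first option, the following is a minimal sketch of how such a configuration might be expressed as Athena table properties. The table name `events`, the helper `projection_ddl`, and the year range 2023 to 2030 are assumptions made for illustration; only the S3 layout comes from the question. The `digits` properties reflect the zero-padded month and day values in the example path.

```python
def projection_ddl(table: str, location: str, first_year: int, last_year: int) -> str:
    """Build an ALTER TABLE statement that sets partition projection properties.

    `table`, `first_year`, and `last_year` are hypothetical inputs chosen by
    the caller; `location` is the table's root S3 prefix.
    """
    root = location.rstrip("/")
    props = {
        "projection.enabled": "true",
        # Year: integer key with an assumed, caller-supplied range.
        "projection.year.type": "integer",
        "projection.year.range": f"{first_year},{last_year}",
        # Month and day: zero-padded to two digits, as in month=10/day=07.
        "projection.month.type": "integer",
        "projection.month.range": "1,12",
        "projection.month.digits": "2",
        "projection.day.type": "integer",
        "projection.day.range": "1,31",
        "projection.day.digits": "2",
        # Template that maps projected key values to S3 prefixes.
        "storage.location.template": (
            f"{root}/year=${{year}}/month=${{month}}/day=${{day}}/"
        ),
    }
    body = ",\n  ".join(f"'{key}' = '{value}'" for key, value in props.items())
    return f"ALTER TABLE {table} SET TBLPROPERTIES (\n  {body}\n)"


if __name__ == "__main__":
    # Print the statement so it can be reviewed before running it in Athena.
    print(projection_ddl("events", "s3://datalake/events/", 2023, 2030))
```

With properties like these in place, Athena computes partition locations from the configured ranges at query time instead of reading partition metadata from the Data Catalog.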

Data Store Management