AWS Certified Data Engineer Associate DEA-C01 Practice Question

A data engineering team processes log files stored in Amazon S3. Nightly AWS Glue ETL jobs write curated data back to S3, while analysts run ad-hoc queries with Amazon Athena and Apache Spark on Amazon EMR. Maintaining separate metastores for each service has resulted in schema drift and extra administration. The team needs a single, serverless data catalog that all three services can reference directly, with the least operational overhead. Which approach satisfies these requirements?

  • Use the AWS Glue Data Catalog as the unified metastore and configure both Athena and EMR to reference it.

  • Create external schemas in Amazon Redshift and have Athena and EMR issue federated queries against them.

  • Run an Apache Hive metastore on the EMR primary node and connect Athena to it with AWS Glue connectors.

  • Store table metadata in an Amazon DynamoDB table and update Athena and EMR Spark jobs to read from it using custom code.
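
As an illustration of what the first option would involve on the EMR side, the sketch below builds the configuration classifications that point Hive and Spark on an EMR cluster at the AWS Glue Data Catalog. The function name `glue_catalog_configurations` is hypothetical and introduced only for this example. The property key and factory class are the ones EMR documents for Glue Data Catalog integration. Athena needs no equivalent setting, since it reads table definitions from the Glue Data Catalog by default.

```python
import json

# Client factory class that tells Hive/Spark on EMR to use the
# AWS Glue Data Catalog instead of a cluster-local Hive metastore.
GLUE_CLIENT_FACTORY = (
    "com.amazonaws.glue.catalog.metastore.AWSGlueDataCatalogHiveClientFactory"
)


def glue_catalog_configurations():
    """Return EMR configuration classifications for Hive and Spark.

    Hypothetical helper for illustration; the returned list has the shape
    expected by the `Configurations` parameter when creating a cluster.
    """
    props = {"hive.metastore.client.factory.class": GLUE_CLIENT_FACTORY}
    return [
        # Applies to Hive running on the cluster.
        {"Classification": "hive-site", "Properties": dict(props)},
        # Applies to Spark SQL running on the cluster.
        {"Classification": "spark-hive-site", "Properties": dict(props)},
    ]


if __name__ == "__main__":
    # Print the JSON that could be supplied at cluster creation time.
    print(json.dumps(glue_catalog_configurations(), indent=2))
```

The resulting list could be passed as `Configurations` to the boto3 EMR client's `run_job_flow` call, or saved as JSON and supplied when creating the cluster from the console or CLI. Cluster name, release label, instance settings, and IAM roles are omitted here and would have to be provided separately.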

Data Store Management