AWS Certified Data Engineer Associate DEA-C01 Practice Question

A company runs two Amazon EMR on EC2 clusters in separate VPCs that process the same Parquet data stored in an Amazon S3 data lake. The analytics team also uses Amazon Athena for ad-hoc queries. The team wants a single place to store table definitions, automatically track schema changes, and avoid managing its own Hive metastore infrastructure. Which approach meets these requirements with minimal operational overhead?

  • Deploy a dedicated MySQL Hive metastore in each VPC, schedule nightly metadata exports to an S3 bucket, and have Athena load the exports before every query.

  • Configure each EMR cluster to use the AWS Glue Data Catalog as its external Hive metastore and grant Athena IAM permissions to the catalog.

  • Store Avro schema files in an S3 location and configure Spark jobs and Athena to reference the files directly at runtime.

  • Enable AWS Lake Formation on the S3 data lake and rely on the default settings without changing the EMR metastore configuration.
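
For reference, the configuration that the second option describes can be sketched as follows. This is a minimal illustration, not a complete cluster definition: the helper name `glue_catalog_configurations` is hypothetical, and the sketch assumes the resulting list is passed as the `Configurations` value when each cluster is created. The `hive-site` and `spark-hive-site` classifications and the factory class name are the ones Amazon EMR documents for using the AWS Glue Data Catalog as the metastore.

```python
import json

# Client factory class that Amazon EMR documents for pointing Hive and
# Spark SQL at the AWS Glue Data Catalog instead of a self-managed metastore.
GLUE_FACTORY = (
    "com.amazonaws.glue.catalog.metastore."
    "AWSGlueDataCatalogHiveClientFactory"
)


def glue_catalog_configurations():
    """Return an EMR 'Configurations' list (hypothetical helper).

    One entry covers Hive (hive-site) and one covers Spark SQL
    (spark-hive-site); both set the same metastore client factory.
    """
    props = {"hive.metastore.client.factory.class": GLUE_FACTORY}
    return [
        {"Classification": "hive-site", "Properties": dict(props)},
        {"Classification": "spark-hive-site", "Properties": dict(props)},
    ]


if __name__ == "__main__":
    # Print the JSON so it can be reviewed or reused for both clusters.
    print(json.dumps(glue_catalog_configurations(), indent=2))
```

Applying the same list to both clusters means they read table definitions from one catalog in the account and Region, which is also the catalog Athena queries. The IAM roles for the EMR instances would additionally need Glue permissions, which this sketch does not cover.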

Data Store Management