AWS Certified Data Engineer Associate DEA-C01 Practice Question

An analytics team must build an AWS Glue Spark job that enriches 500 GB of Parquet click-stream data stored in Amazon S3 with a 5 GB customer dimension table that resides in an Amazon RDS for PostgreSQL instance. The solution must minimize infrastructure management, let multiple future jobs reuse the same metadata, and ensure that all traffic stays within the VPC. Which approach meets these requirements?

  • Use AWS DMS to replicate the RDS table into Amazon DynamoDB and query DynamoDB from the Glue Spark job for the customer dimension data.

  • Set up AWS Database Migration Service to export the RDS table to Amazon S3 each night, crawl the exported files, and join them with the click-stream data in the Glue job.

  • Configure Amazon Athena with the PostgreSQL federated query connector and have the Glue job retrieve the customer table by querying Athena during each run.

  • Create an AWS Glue JDBC connection to the RDS endpoint in the VPC, run a crawler with that connection to catalog the customer table, and have the Glue Spark job read the cataloged JDBC table alongside the Parquet files.
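The JDBC-connection-plus-crawler approach described in the final option can be sketched as a Glue Spark (PySpark) job script. This is a minimal illustrative sketch only: the catalog database name `clickstream_db`, table name `customer_dim`, join key `customer_id`, and the S3 paths are assumed placeholders, not values given in the question, and the script runs only inside the AWS Glue service.

```python
# Hedged sketch of a Glue Spark job that joins cataloged data sources.
# Assumptions (not from the question): database "clickstream_db",
# table "customer_dim", join key "customer_id", example S3 paths.
import sys
from awsglue.context import GlueContext
from awsglue.job import Job
from awsglue.utils import getResolvedOptions
from pyspark.context import SparkContext

args = getResolvedOptions(sys.argv, ["JOB_NAME"])
glue_context = GlueContext(SparkContext.getOrCreate())
job = Job(glue_context)
job.init(args["JOB_NAME"], args)

# Read the Parquet click-stream data directly from S3.
clicks = glue_context.create_dynamic_frame.from_options(
    connection_type="s3",
    connection_options={"paths": ["s3://example-bucket/clickstream/"]},
    format="parquet",
).toDF()

# Read the customer dimension table that the crawler cataloged through the
# JDBC connection; Glue pulls it from RDS over the VPC at run time, and any
# future job can reuse the same Data Catalog entry.
customers = glue_context.create_dynamic_frame.from_catalog(
    database="clickstream_db",
    table_name="customer_dim",
).toDF()

# Enrich each click-stream event with customer attributes and write back to S3.
enriched = clicks.join(customers, on="customer_id", how="left")
enriched.write.mode("overwrite").parquet("s3://example-bucket/enriched/")

job.commit()
```

Because both sources are registered in the Data Catalog, other jobs can call `create_dynamic_frame.from_catalog` with the same database and table names without redefining the connection.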

Domain: Data Ingestion and Transformation