AWS Certified Data Engineer Associate DEA-C01 Practice Question

A travel company ingests daily reservation files into an S3 data lake. An AWS Glue ETL job converts the CSV input to Parquet for downstream analytics. The data owner requires that every output record contain non-empty values for the BookingId and TravelerEmail columns. Records that do not meet this rule must be excluded from the Parquet dataset and stored separately for review. Which solution will satisfy the requirement with the least custom code and without adding compute services?

  • Insert a DropFields transformation for BookingId and TravelerEmail; rows with null values will automatically be excluded when the columns are dropped.

  • Use a ResolveChoice transformation to cast both columns to string; Glue implicitly skips rows where the cast fails because of null or empty values.

  • Add an ApplyMapping transformation that converts BookingId and TravelerEmail to a non-nullable data type; Glue will remove rows that violate the schema.

  • Add a Filter transformation to the DynamicFrame that returns True only when both columns are not null and not an empty string, then write the rejected records to a different S3 prefix.
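
For readers who want to see what the filter-based option would look like in practice, here is a minimal sketch of the predicate it describes, written in plain Python so it runs without a Glue environment. The names `REQUIRED_COLUMNS`, `has_required_fields`, and `split_records` are hypothetical and chosen for illustration. The sketch also assumes each record can be read like a dictionary. The commented lines at the end show how the same predicate might be passed to Glue's `Filter.apply(frame=..., f=...)`; the variable `dyf` there is an assumed DynamicFrame that is not defined in this sketch.

```python
from typing import Any, Iterable, List, Mapping, Tuple

# Columns named in the question; the constant name itself is hypothetical.
REQUIRED_COLUMNS = ("BookingId", "TravelerEmail")


def has_required_fields(record: Mapping[str, Any]) -> bool:
    """Return True only when every required column is non-null and not an empty string."""
    for column in REQUIRED_COLUMNS:
        value = record.get(column)  # a missing column is treated like a null value
        if value is None or value == "":
            return False
    return True


def split_records(
    records: Iterable[Mapping[str, Any]],
) -> Tuple[List[Mapping[str, Any]], List[Mapping[str, Any]]]:
    """Partition records into (valid, rejected) using the predicate above."""
    valid: List[Mapping[str, Any]] = []
    rejected: List[Mapping[str, Any]] = []
    for record in records:
        if has_required_fields(record):
            valid.append(record)
        else:
            rejected.append(record)
    return valid, rejected


# Inside a Glue job the same predicate could be applied twice (assumed usage,
# with `dyf` standing in for the DynamicFrame read from the CSV input):
#
#   from awsglue.transforms import Filter
#   valid_dyf = Filter.apply(frame=dyf, f=has_required_fields)
#   rejected_dyf = Filter.apply(frame=dyf, f=lambda r: not has_required_fields(r))
#
# Each resulting frame would then be written to its own S3 prefix.
```

Because a filter keeps only the rows for which the predicate returns True, capturing the rejected rows requires a second pass with the inverted predicate, as the commented lines suggest.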

Data Operations and Support