Microsoft Fabric Data Engineer Associate DP-700 Practice Question

A nightly Microsoft Fabric pipeline loads a Parquet file into a bronze folder and then upserts the data into a silver Delta Lake table named Customers. The file may contain repeated customer_id values because of late updates or replayed files. You need the silver table to retain only the row with the newest updated_at value for each customer_id, and re-running the pipeline must not introduce duplicates. Which approach should you use?

  • A Spark notebook that reads the file, writes it to the Customers Delta table in append mode, and then runs OPTIMIZE Customers ZORDER BY (updated_at).

  • A Spark notebook that executes a Delta Lake MERGE INTO Customers USING the nightly DataFrame ON customer_id, updating a matched row only when the incoming updated_at value is newer and inserting the row when no match exists (see the sketch after the options).

  • A Spark notebook that calls dropDuplicates("customer_id") on the DataFrame and overwrites the Customers table on each load.

  • A Data Factory copy activity that writes the file to the lakehouse with the preserveHierarchy option set to true and skipDuplicates enabled.
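For reference, the MERGE-based option follows the standard Delta Lake upsert pattern. Below is a minimal PySpark sketch, assuming the notebook is attached to the lakehouse that holds Customers; the bronze file path is a placeholder, and the window-based deduplication step is an added safeguard the option does not spell out.

    from delta.tables import DeltaTable
    from pyspark.sql import SparkSession, functions as F
    from pyspark.sql.window import Window

    spark = SparkSession.builder.getOrCreate()

    # Read the nightly bronze file (path is illustrative).
    incoming = spark.read.parquet("Files/bronze/customers/latest.parquet")

    # The batch itself may repeat customer_id, so keep only the newest
    # updated_at row per key before merging; Delta Lake raises an error
    # if several source rows match the same target row.
    w = Window.partitionBy("customer_id").orderBy(F.col("updated_at").desc())
    latest = (incoming
              .withColumn("rn", F.row_number().over(w))
              .filter("rn = 1")
              .drop("rn"))

    # Idempotent upsert: update only when the incoming row is newer,
    # insert when the key is absent.
    (DeltaTable.forName(spark, "Customers").alias("t")
     .merge(latest.alias("s"), "t.customer_id = s.customer_id")
     .whenMatchedUpdateAll(condition="s.updated_at > t.updated_at")
     .whenNotMatchedInsertAll()
     .execute())

The update condition s.updated_at > t.updated_at is what makes a replayed file a no-op rather than a source of duplicates: identical rows match but fail the condition, so the table is left unchanged.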

Ingest and transform data