Microsoft Fabric Data Engineer Associate DP-700 Practice Question

A Fabric lakehouse contains a Delta table named FactSales with 400 million rows. A Spark notebook joins this table to a lookup DataFrame created from a 5,000-row dimension table and then aggregates the result. The join currently runs for about 10-12 minutes and shows heavy shuffle activity in the Spark UI. You cannot increase the session's compute resources. Which change will most likely reduce the job's runtime?

  • Apply the broadcast join hint to the dimension DataFrame before performing the join.

  • Repartition the FactSales DataFrame to 5,000 partitions with the repartition() method before the join.

  • Increase the value of spark.sql.shuffle.partitions from 200 to 10 000.

  • Convert both tables to CSV files and read them with spark.read.csv to avoid Delta overhead.

Exam: Microsoft Fabric Data Engineer Associate DP-700
Objective: Monitor and optimize an analytics solution
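
The scenario hinges on joining a very large fact table to a tiny dimension table. Below is a minimal PySpark sketch of the broadcast-hint approach named in the first option; the dimension table name, join key, and aggregation columns (DimProduct, ProductKey, Category, SalesAmount) are hypothetical and used only to illustrate the pattern.

```python
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.getOrCreate()

# FactSales comes from the question; the dimension table and column
# names below are illustrative placeholders.
fact_sales = spark.read.table("FactSales")    # ~400 million rows
dim_lookup = spark.read.table("DimProduct")   # small (~5,000-row) dimension table

# F.broadcast() hints that the small DataFrame should be copied to every
# executor, letting Spark run a broadcast hash join instead of shuffling
# the large fact table across the cluster.
joined = fact_sales.join(
    F.broadcast(dim_lookup),
    on="ProductKey",
    how="inner",
)

result = (
    joined
    .groupBy("Category")
    .agg(F.sum("SalesAmount").alias("TotalSales"))
)
```

With the hint in place, Spark can broadcast the small table rather than shuffling the 400-million-row fact table, which is why this approach directly targets the heavy shuffle activity described in the question.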