
Microsoft Fabric Data Engineer Associate Practice Test (DP-700)

Use the form below to configure your Microsoft Fabric Data Engineer Associate Practice Test (DP-700). The practice test can be configured to include only certain exam objectives and domains. You can choose between 5 and 100 questions and set a time limit.

Questions
Number of questions in the practice test
Free users are limited to 20 questions; upgrade for unlimited questions
Seconds Per Question
Determines how long you have to finish the practice test
Exam Objectives
Which exam objectives should be included in the practice test

Microsoft Fabric Data Engineer Associate DP-700 Information

The Microsoft Fabric Data Engineer Associate (DP-700) exam shows that you know how to work with data in Microsoft Fabric. It tests your ability to collect, organize, and prepare data so it can be used for reports and dashboards. Passing the DP-700 means you can build and manage data pipelines, work with Fabric tools such as lakehouses, warehouses, notebooks, and Power BI, and make sure data is clean and ready for analysis.

This exam is best for people who already have some experience working with data or databases and want to move into a data engineering role. If you enjoy working with numbers, building reports, or using SQL and Python to manage data, this certification can help you stand out to employers. It’s designed for anyone who wants to show their skills in data handling using Microsoft tools.

Before taking the real exam, it’s smart to use DP-700 practice exams, practice tests, and practice questions to prepare. These tools help you get used to the types of questions you’ll see on test day and show which topics you need to study more. By using practice tests often, you can build confidence, improve your score, and walk into the exam knowing what to expect.

  • Free Microsoft Fabric Data Engineer Associate DP-700 Practice Test

  • 20 Questions
  • Unlimited
  • Implement and manage an analytics solution
  • Ingest and transform data
  • Monitor and optimize an analytics solution

Free Preview

This test is a free preview; no account is required.
Subscribe to unlock all content, keep track of your scores, and access AI features!

Question 1 of 20

You manage a Microsoft Fabric workspace that contains a lakehouse used by finance analysts. The analysts frequently export tables to Excel and then email the files outside the organization. Management requests that every exported Excel file be automatically encrypted and marked "Confidential - Finance" without requiring the analysts to take any additional steps. Which action will meet this requirement with the current capabilities of Microsoft Fabric?

  • Set a default sensitivity label for the entire workspace so that all contained items inherit the label and its protection settings automatically.

  • Enable Azure Information Protection on the OneLake storage account and configure an encryption policy that targets the finance folder.

  • Apply the "Confidential - Finance" Microsoft Purview sensitivity label directly to the lakehouse item in the Fabric workspace.

  • This requirement cannot be met because Fabric lakehouse exports do not yet support sensitivity-label-based encryption or content marking.

Question 2 of 20

During a scheduled refresh of a Fabric Dataflow Gen2, the operation fails with the message "Expression.Error: The column CustomerID of the table was not found." An upstream SQL view was recently modified and the column was renamed to CustID. You need to make the refresh succeed again without recreating the dataflow. What should you do in the dataflow editor?

  • Recreate the gateway connection using basic authentication credentials.

  • Enable the dataflow's Enhanced Compute Engine and re-run the refresh.

  • Edit the Power Query steps, replace references to CustomerID with CustID in the Source or subsequent transformation steps, then save and refresh.

  • Increase the dataflow refresh timeout value in Settings and retry.

Question 3 of 20

CSV files land in Azure Blob Storage and are exposed in OneLake via a shortcut. You must build a Fabric solution that automatically starts when each new file arrives, passes the file path to an existing PySpark notebook that cleans the data, and loads the output into a Fabric warehouse. The solution must include event triggers, parameterized notebook input, and built-in retry and alerting. Which Fabric component should you create?

  • Create a Dataflow Gen2 to transform the data and load it into the warehouse.

  • Create a Data Factory pipeline that contains an Execute Notebook activity followed by a Copy Data activity.

  • Create a SQL pipeline in the Fabric warehouse to run the PySpark logic and load the data.

  • Create a Spark notebook and schedule it to run on a frequent interval.
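
For background on how a pipeline can pass a file path to a notebook, the following is a minimal sketch of a parameterized Fabric PySpark notebook, not the question's required solution; the parameter name file_path, the cleaning steps, and the table name staging_sales are illustrative assumptions.

# Parameter cell (marked as a parameter cell in the notebook editor); a pipeline
# Notebook activity can override this default value at run time.
file_path = "Files/landing/sample.csv"  # illustrative default

# Read the file the pipeline passed in, apply simple cleaning, and persist the result.
df = spark.read.option("header", "true").csv(file_path)
cleaned = df.dropDuplicates().na.drop()  # illustrative cleaning steps
cleaned.write.format("delta").mode("append").saveAsTable("staging_sales")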

Question 4 of 20

Your team develops a Microsoft Fabric lakehouse solution. You must store the definitions of all SQL objects (tables, views, and stored procedures) in a Git repository and automate schema deployments to test and production workspaces by using Azure DevOps pipelines. Which Fabric workspace item should you create to meet these requirements?

  • Export the lakehouse tables as Delta files and commit the files to the repository.

  • Enable Git integration for the workspace without adding any additional artifacts.

  • Author a notebook that contains CREATE TABLE and ALTER statements and store it in Git.

  • Create a database project in the workspace and connect it to the Git repository.

Question 5 of 20

In a Microsoft Fabric PySpark notebook, you have a DataFrame named df that contains incremental changes for the Customers dimension. You must write the data to the lakehouse path "Tables/dim_customer" so that it is stored in Delta format, automatically merges any new columns in future loads, and is physically partitioned by the Country column. Which PySpark write command meets all these requirements?

  • df.repartition("Country").write.format("delta").mode("append").save("Tables/dim_customer")

  • df.write.format("delta").mode("overwrite").option("overwriteSchema", "true").partitionBy("Country").save("Tables/dim_customer")

  • df.write.format("parquet").mode("append").option("mergeSchema", "true").partitionBy("Country").save("Tables/dim_customer")

  • df.write.format("delta").mode("append").option("mergeSchema", "true").partitionBy("Country").save("Tables/dim_customer")
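
As background on how these Delta write options combine, here is a minimal sketch assuming a Fabric Spark notebook with a default lakehouse attached; the DataFrame, path, and column names come from the question.

# Append the incremental batch in Delta format, allow new columns to be merged
# into the table schema on future loads, and partition the files by Country.
(
    df.write.format("delta")
      .mode("append")
      .option("mergeSchema", "true")
      .partitionBy("Country")
      .save("Tables/dim_customer")
)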

Question 6 of 20

A stored procedure in a Microsoft Fabric warehouse runs this statement to upsert rows from StgSales into DimCustomer:

MERGE dbo.DimCustomer AS tgt
USING dbo.StgSales AS src
    ON tgt.CustomerID = src.CustomerID
WHEN MATCHED THEN
    UPDATE SET tgt.City = src.City, tgt.Region = src.Region
WHEN NOT MATCHED BY TARGET THEN
    INSERT (CustomerID, City, Region)
    VALUES (src.CustomerID, src.City, src.Region);

Execution fails with the error: "The MERGE statement attempted to UPDATE or DELETE the same row more than once. A target row matched more than one source row."
You must correct the T-SQL so the procedure succeeds while still performing the required updates and inserts.

Which change should you make to the statement?

  • Execute SET IDENTITY_INSERT dbo.DimCustomer ON immediately before running the MERGE.

  • Rewrite the USING clause to select DISTINCT CustomerID, City, Region from dbo.StgSales before the MERGE is executed.

  • Replace the MERGE with an INSERT statement that uses the ON ERROR clause to ignore conflicts.

  • Add the table hint WITH (NOLOCK) to dbo.StgSales in the USING clause.
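
As general background on the underlying issue (more than one source row per key), the following PySpark sketch shows one common way to deduplicate a staging set before an upsert; the DataFrame name stg_sales_df and the LoadDate column used to pick the latest row are illustrative assumptions, not part of the question.

from pyspark.sql import Window
from pyspark.sql import functions as F

# Keep a single row per CustomerID (the most recent by an assumed LoadDate column)
# so a subsequent upsert never matches the same target row more than once.
w = Window.partitionBy("CustomerID").orderBy(F.col("LoadDate").desc())
deduped = (
    stg_sales_df.withColumn("rn", F.row_number().over(w))
                .filter(F.col("rn") == 1)
                .drop("rn")
)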

Question 7 of 20

You need to give analysts in Workspace A access to a set of parquet files that already reside in Workspace B's lakehouse without copying the data. You plan to create a shortcut in Workspace A that references the existing folder. Which statement about managing shortcuts in Microsoft Fabric meets the requirement?

  • A shortcut can only be created at the root of the lakehouse; it cannot be placed inside sub-folders.

  • A shortcut relies on the source folder's permissions, so analysts must already have read access to the folder in Workspace B.

  • You must schedule a pipeline refresh for the shortcut; otherwise, newly added files will not appear to analysts.

  • After you create the shortcut, renaming the source folder is automatically reflected and requires no additional action.

Question 8 of 20

You manage a Microsoft Fabric lakehouse that ingests micro-batches into a Delta table named Sales. After several months, the table contains thousands of very small data files, and analysts report that queries filtering on the OrderDate column now take much longer to finish. With minimal code changes, which action should you perform to most effectively improve scan and query performance on the Sales table?

  • Execute OPTIMIZE Sales ZORDER BY (OrderDate); to compact small files and cluster rows on the filter column.

  • Run VACUUM Sales RETAIN 0 HOURS; to delete obsolete data files from the table.

  • Use COPY INTO to export the data to a single large Parquet file and replace the table.

  • Increase the notebook session's driver memory to provide more Spark cache capacity during queries.
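
For reference, file compaction on a lakehouse Delta table can be run from a Fabric Spark notebook with a sketch like the one below, assuming the Sales table is registered in the attached lakehouse.

# Compact the many small files and co-locate rows on OrderDate so that
# queries filtering on OrderDate can skip more files.
spark.sql("OPTIMIZE Sales ZORDER BY (OrderDate)")

# Optional check (illustrative): inspect recent table operations.
spark.sql("DESCRIBE HISTORY Sales").show(5, truncate=False)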

Question 9 of 20

You are designing a real-time telemetry solution in Microsoft Fabric. Business analysts must be able to query incoming sensor readings together with the last three months of historical data by using the same syntax that they currently use in Azure Data Explorer. The solution must offer built-in windowing, time-series functions, and automatic data retention without requiring you to write code. Which streaming engine should you choose?

  • Deploy an Azure Stream Analytics job that writes to a Fabric Data Warehouse.

  • Develop a Spark Structured Streaming notebook that writes the data to Delta tables in a lakehouse.

  • Use Eventstream only to capture the data to a lakehouse and query it later with SQL.

  • Create a KQL database in Real-Time Analytics and ingest the stream directly into a KQL table.

Question 10 of 20

You manage a Microsoft Fabric warehouse. A 4-TB sales fact table is currently stored using the default clustered columnstore index and a ROUND_ROBIN distribution. Most analytical queries join this table with a 15-GB Date dimension and filter on the calendar year. The queries scan many unnecessary rows and exhibit high data movement during joins. Without purchasing additional capacity, which change is most likely to reduce both scan and shuffle costs?

  • Replicate the fact table instead of distributing it so all compute nodes have a local copy.

  • Convert the fact table to a clustered rowstore index to improve predicate pushdown on DateKey.

  • Redistribute the fact table by using HASH on the DateKey column while keeping the clustered columnstore index.

  • Create a materialized view that filters the fact table to the current calendar year.

Question 11 of 20

You are building a Microsoft Fabric Eventstream to ingest sensor readings from Azure IoT Hub. Each event payload contains the fields deviceId (string), ts (epoch milliseconds), temperature (double), and humidity (double).

You must satisfy the following processing requirements:

  • Guarantee that statistics are calculated by the timestamp in each event, even if messages arrive out of order.
  • Discard any event that arrives more than 2 minutes after its ts value.
  • Produce a running 1-minute tumbling-window average of the temperature for each device and store the result in a Real-Time Analytics KQL database table.

Which configuration should you apply to the Eventstream input or query to meet all the requirements?

  • Use the system column _arrivalTime for windowing, add a WHERE clause that filters events older than 2 minutes, and write results to the KQL table every minute.

  • Mark ts as the event-time column and use a 1-minute hopping window with a 30-second hop size; do not configure out-of-order tolerance because tumbling windows implicitly drop late data.

  • Leave event ordering at the default arrival time, and in the query declare a 1-minute session window on ts; set the session timeout to 2 minutes to ignore late events.

  • Mark ts as the event-time column on the IoT Hub input, set a 2-minute out-of-order tolerance with the late-arrival policy set to Drop, and in the query use a 1-minute tumbling window on ts with GROUP BY deviceId.
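
Eventstream applies these settings through its input and query configuration rather than code, but the event-time concepts involved can be illustrated with an analogous Spark Structured Streaming sketch; the streaming DataFrame readings_df and the exact late-arrival behavior are assumptions for illustration, not the Eventstream implementation.

from pyspark.sql import functions as F

# Treat ts (epoch milliseconds) as the event-time column, tolerate events up to
# 2 minutes late, and compute a 1-minute tumbling-window average per device.
windowed_avg = (
    readings_df
        .withColumn("event_time", (F.col("ts") / 1000).cast("timestamp"))
        .withWatermark("event_time", "2 minutes")
        .groupBy(F.window("event_time", "1 minute"), "deviceId")
        .agg(F.avg("temperature").alias("avg_temperature"))
)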

Question 12 of 20

Queries that join the SalesFact and ReturnsFact tables in your Microsoft Fabric warehouse frequently incur a high-shuffle data-movement step and run for several minutes. Both tables currently use ROUND_ROBIN distribution and each contains over 500 million rows. You must reduce data movement and accelerate the join without replicating either table. Which T-SQL action should you take?

  • Rebuild both tables as clustered columnstore indexes ordered by the join column.

  • Create nonclustered B-tree indexes on the join column in both tables.

  • Enable result-set caching for the warehouse.

  • Rebuild both tables by using HASH distribution on the shared join column.

Question 13 of 20

Your organization uses Microsoft Fabric workspaces backed by a capacity in the F64 SKU. As a Fabric administrator, you need to set up a near real-time feed of all workspace-level activities (such as item creation, permission changes, and publish operations) so that your Security Operations Center (SOC) can query the data with Kusto Query Language (KQL) and build custom alerts. Which action should you perform first to meet this requirement?

  • Download the Power Platform admin audit log and schedule a notebook to upload the file to a Kusto database.

  • Assign the Log Analytics Contributor role to the SOC analysts on the Fabric workspace.

  • Create a deployment pipeline and enable its workspace usage metrics dataset for the SOC team.

  • Create an Azure Monitor diagnostic setting on the Fabric capacity or workspace and send the Fabric Activity log to a Log Analytics workspace.

Question 14 of 20

Your team must put a Microsoft Fabric workspace under source control with an Azure DevOps Git repository named FabricData, which already has main and develop branches. The requirements are as follows: workspace commits must default to the develop branch, item JSON files must be stored in a folder that matches the workspace name, and engineers must be able to pull and push from within the Fabric portal. Which action should you take first in the Fabric portal?

  • Enable workspace-level audit logging and specify the FabricData repository as the log storage location.

  • Connect the workspace to the FabricData repository in Workspace settings and set the default branch to develop with a folder path that matches the workspace name.

  • Change the default branch in Azure DevOps from main to develop and then clone the repository locally.

  • Create a deployment pipeline that uses the FabricData repository as the source environment and assigns the develop branch to the test stage.

Question 15 of 20

You manage a Microsoft Fabric workspace that contains a data warehouse named SalesDW. Your DevOps team wants every table, view, and stored procedure in the warehouse to be stored in a Git repository so that schema changes can be reviewed and deployed through an Azure DevOps pipeline to test and production workspaces. What is the most appropriate first step to create a deployable artifact that captures the current warehouse schema?

  • Use Visual Studio Code with the SQL Database Projects extension to import SalesDW and build a .dacpac file.

  • Export SalesDW from the Fabric portal as a Power BI project (.pbip) and push it to the Git repository.

  • Generate an Azure Resource Manager (ARM) template for the warehouse item from the Azure portal and store it in Git.

  • Execute a T-SQL BACKUP DATABASE command in a Fabric notebook and add the backup file to source control.

Question 16 of 20

You are creating a new Microsoft Fabric workspace that will host several lakehouses for a sales analytics project. To comply with organizational policies, you must ensure that:

  • All files written to OneLake in this workspace are stored in the West Europe region.
  • Workspace consumers must be able to read and query data, but they must NOT be able to download the underlying parquet or CSV files from OneLake.

Which combination of settings should you configure in the OneLake section of the workspace settings to meet both requirements?

  • Enable OneLake shortcuts and set the Default storage location to West Europe.

  • Change the default file format to Delta and enable personal workspaces only.

  • Set Default storage location to West Europe and disable the Download files from OneLake toggle.

  • Disable the V-Order optimization option and enable item-level security.

Question 17 of 20

In Microsoft Fabric, you need to allow business analysts to mark their own lakehouses as Promoted while ensuring that only members of a central data governance team can apply the Certified badge to any item. Which approach meets these requirements?

  • Add the governance team to the tenant's certification security group and keep the default endorsement settings that allow any contributor to promote their own items.

  • Disable item promotion at the tenant level and grant the governance team workspace Admin rights to certify selected items.

  • Create a custom role in each workspace that alone can apply both Promoted and Certified badges to items.

  • Make all business analysts workspace Admins and require Admin permission to certify or promote items.

Question 18 of 20

A Fabric Eventstream ingests JSON telemetry from Azure Event Hubs and writes the data to a Delta table in a Lakehouse. After a device firmware update, the incoming payload now contains an additional field named "pressure". In the Eventstream monitoring pane you notice that events are being dropped with a "column count mismatch" error on the Lakehouse output. You must allow the new field to be stored without losing events or recreating the table. Which action should you take?

  • Modify the Eventstream transformation to exclude the "pressure" field from the SELECT statement.

  • Run a Spark SQL ALTER TABLE command in the Lakehouse to add a "pressure" column with the appropriate data type.

  • Enable the Auto create table option on the Lakehouse output so the table is regenerated with the new column.

  • Increase the Lakehouse output batch interval to give the service more time to process larger events.
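
For reference, adding a column to an existing lakehouse Delta table from a notebook can look like the following minimal sketch; the table name telemetry is an illustrative assumption.

# Add the new field to the existing Delta table without recreating it; rows
# ingested before the change simply show NULL for the new column.
spark.sql("ALTER TABLE telemetry ADD COLUMNS (pressure DOUBLE)")
spark.sql("SELECT * FROM telemetry LIMIT 5").show()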

Question 19 of 20

You manage a Microsoft Fabric Warehouse named SalesDW in a workspace. The Orders table contains a SensitiveAmount column that must be completely hidden from all analysts except members of the Azure AD group FinanceLeads. Analysts still need to query every other column in the table. Which approach meets this requirement with minimal ongoing administration?

  • Create a row-level security policy that filters SensitiveAmount for non-FinanceLeads users.

  • Create a database role for regular analysts, DENY SELECT on SensitiveAmount to that role, and GRANT SELECT on Orders to FinanceLeads.

  • Hide SensitiveAmount with object-level security in a semantic model and have analysts query through that model only.

  • Apply dynamic data masking to SensitiveAmount and grant SELECT on Orders to all analysts.

Question 20 of 20

You manage a Fabric Eventstream that ingests JSON telemetry from Azure IoT Hub and routes the data to an Eventhouse table. After a recent device firmware update, the Eventstream Monitoring dashboard shows a rapid increase in the "Failed to write events" metric for the Eventhouse output, while the "Input events" metric remains steady. Which action should you take first to identify the root cause of the failures?

  • Examine the rejected events in the Eventhouse destination's error store (dead-letter folder).

  • Delete and recreate the Eventhouse output with the "Auto create table" option enabled.

  • Refresh the Eventstream input schema to force automatic column mapping.

  • Stop and restart the Eventstream to clear transient write errors.