Microsoft Azure AI Engineer Associate AI-102 Practice Question

Your team runs a prompt flow in Azure AI Foundry that calls an Azure OpenAI deployment named "chat-prod". The deployment currently uses the gpt-35-turbo-1106 model and often hits its tokens-per-minute (TPM) quota during peak hours. You also need to move to the newer gpt-35-turbo-0125 model without breaking any running clients. Which approach best meets both requirements: higher, scalable throughput and a zero-downtime model update?

  • Edit the existing deployment to change only the model name to gpt-35-turbo-0125 and rely on Azure OpenAI to raise tokens-per-minute limits automatically.

  • Delete the current "chat-prod" deployment, recreate it with gpt-35-turbo-0125, and then ask Microsoft Support for a higher quota on the new deployment.

  • Provision another Azure OpenAI resource in a different region, deploy gpt-35-turbo-0125 there, and hard-code your application to send half of the requests to the new endpoint.

  • Create a second deployment in the same Azure OpenAI resource that uses gpt-35-turbo-0125, request the additional tokens-per-minute quota for it, and gradually switch traffic from "chat-prod" to the new deployment before deleting the original one.
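The gradual traffic switch described in the last option can be sketched in client code as weighted routing between the two deployment names. This is a minimal illustration, not an official pattern: the deployment name "chat-prod-0125" and the ramp-up fraction are assumptions, and in production the fraction would typically come from configuration rather than a hard-coded value.

```python
import random

# Hypothetical deployment names for the blue-green style cutover.
OLD_DEPLOYMENT = "chat-prod"        # existing gpt-35-turbo-1106 deployment
NEW_DEPLOYMENT = "chat-prod-0125"   # assumed name for the new gpt-35-turbo-0125 deployment


def pick_deployment(new_traffic_fraction: float) -> str:
    """Route a request to the new deployment with the given probability.

    Ramping new_traffic_fraction from 0.0 to 1.0 over time shifts load
    gradually; once it reaches 1.0 and is stable, the old deployment can
    be deleted without any client-visible downtime.
    """
    if random.random() < new_traffic_fraction:
        return NEW_DEPLOYMENT
    return OLD_DEPLOYMENT
```

Because both deployments live in the same Azure OpenAI resource, only the deployment name in each request changes; the endpoint and API key stay the same, so running clients are unaffected during the switch.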

Implement generative AI solutions