Microsoft Azure AI Engineer Associate AI-102 Practice Question
Your team runs a prompt flow in Azure AI Foundry that calls an Azure OpenAI deployment named "chat-prod". The deployment currently uses model gpt-35-turbo-1106 and often reaches its tokens-per-minute quota during peak hours. You also need to move to the newer gpt-35-turbo-0125 model without breaking any running clients. Which approach best meets both requirements: higher, scalable throughput and a zero-downtime model update?
Edit the existing deployment to change only the model name to gpt-35-turbo-0125 and rely on Azure OpenAI to autoscale tokens-per-minute limits automatically.
Delete the current "chat-prod" deployment, recreate it with gpt-35-turbo-0125, and then ask Microsoft Support for a higher quota on the new deployment.
Provision another Azure OpenAI resource in a different region, deploy gpt-35-turbo-0125 there, and hard-code your application to send half of the requests to the new endpoint.
Create a second deployment in the same Azure OpenAI resource that uses gpt-35-turbo-0125, request the additional tokens-per-minute quota for it, and gradually switch traffic from "chat-prod" to the new deployment before deleting the original one.
Creating a second deployment lets you (1) request separate or additional quota for it, so overall tokens-per-minute capacity increases, and (2) point the new deployment at the newer model version. While both deployments are active, you can gradually redirect traffic, either in your application or through an alias or traffic-weighting layer in front of the deployments, and decommission the old deployment only after you have confirmed stability. Modifying the existing deployment in place or deleting it risks breaking running clients and adds no quota. Moving to a different Azure OpenAI resource or region does add capacity, but it still requires client-side changes and does not help with an in-place model swap.
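The gradual traffic shift described in the explanation could be done in application code roughly as follows. This is a minimal sketch, not an Azure feature: the new deployment name "chat-prod-0125", the rollout percentages, and the `pick_deployment` helper are all assumptions made for illustration. The chosen name would then be passed as the deployment (the `model` argument) on each chat completion request.

```python
import random

OLD_DEPLOYMENT = "chat-prod"        # existing deployment from the question
NEW_DEPLOYMENT = "chat-prod-0125"   # hypothetical name for the second deployment

# Hypothetical rollout schedule: share of requests sent to the new deployment.
ROLLOUT_STEPS = [0.05, 0.25, 0.50, 1.00]


def pick_deployment(new_share, rng=random):
    """Choose the deployment for one request.

    new_share is the fraction of traffic (0.0 to 1.0) routed to the
    new deployment; the rest stays on the old one.
    """
    if not 0.0 <= new_share <= 1.0:
        raise ValueError("new_share must be between 0.0 and 1.0")
    # rng.random() is in [0.0, 1.0), so 0.0 never picks the new
    # deployment and 1.0 always does.
    return NEW_DEPLOYMENT if rng.random() < new_share else OLD_DEPLOYMENT


if __name__ == "__main__":
    rng = random.Random(42)  # seeded only to make the demo repeatable
    for share in ROLLOUT_STEPS:
        sample = [pick_deployment(share, rng) for _ in range(1000)]
        to_new = sample.count(NEW_DEPLOYMENT)
        print(f"target {share:.0%}: {to_new} of 1000 requests went to {NEW_DEPLOYMENT}")
```

Once the share reaches 1.0 and the new deployment has been stable for a while, the old deployment no longer receives traffic and can be deleted.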
Implement generative AI solutions