GCP Professional Cloud Architect Practice Question
A genomics research group wants to train a 180-billion-parameter transformer model on Google Cloud. Their code includes several custom CUDA kernels that are not compatible with XLA, and internal benchmarks show best throughput on NVIDIA H100 GPUs interconnected with NVSwitch. The scientists will orchestrate experiments with Vertex AI Pipelines and need to scale to multiple hosts that each provide 200 Gbps of network bandwidth for fast parameter exchange. Which solution should the cloud architect recommend?
Run the workload as a Vertex AI training job on Cloud TPU v4-32 pod slices to achieve petaflop-scale BF16 performance.
Create Vertex AI custom training jobs that use the A3 machine series (8 × H100 80 GB GPUs with NVSwitch) and run multi-host distributed training across several A3 virtual machines.
Package the training code in a container and deploy it to Cloud Run on GPU with T4 accelerators, then scale the service horizontally.
Provision a GKE Autopilot cluster with NVIDIA P100 GPU node pools and execute the training using Kubeflow Pipelines.
Because the training code depends on custom CUDA kernels that are not compatible with XLA, the team needs hardware that exposes a standard CUDA programming environment. Cloud TPUs are programmed through XLA rather than CUDA, so the TPU v4 option cannot run those kernels. The T4 GPUs in the Cloud Run option and the P100 GPUs in the GKE option are older accelerators without NVSwitch, and neither can match the benchmarked H100 performance or the multi-host scale the team requires. The A3 machine series supplies eight H100 80 GB GPUs per VM connected by NVSwitch and meets the 200 Gbps per-host network bandwidth requirement. Vertex AI custom training jobs can target this machine type and run distributed training across multiple A3 VMs, and they integrate cleanly with Vertex AI Pipelines orchestration. The A3-based custom training approach is therefore the only option that satisfies the hardware, networking, and orchestration needs.
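To make the recommended option concrete, here is a minimal sketch of the worker pool layout that a multi-host Vertex AI custom training job on A3 VMs might use. It only builds the plain dictionaries, so it runs without any Google Cloud packages. The helper names `build_a3_worker_pool_specs` and `total_gpus`, the container image URI, and the node count are assumptions made for illustration and do not come from the question.

```python
# Sketch only: builds the worker_pool_specs structure that Vertex AI custom
# training jobs accept. Helper names, image URI, and node count are hypothetical.

def build_a3_worker_pool_specs(image_uri, num_nodes):
    """Return worker pool specs for num_nodes A3 VMs (8 x H100 80 GB each)."""
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")

    def pool(replica_count):
        return {
            "machine_spec": {
                "machine_type": "a3-highgpu-8g",         # A3 VM shape
                "accelerator_type": "NVIDIA_H100_80GB",  # H100 80 GB GPUs
                "accelerator_count": 8,                  # eight GPUs per VM
            },
            "replica_count": replica_count,
            "container_spec": {"image_uri": image_uri},
        }

    # The first pool holds the single primary replica; any remaining
    # nodes go into a second pool of workers.
    specs = [pool(1)]
    if num_nodes > 1:
        specs.append(pool(num_nodes - 1))
    return specs


def total_gpus(specs):
    """Count GPUs across all pools: replicas times GPUs per replica."""
    return sum(
        p["replica_count"] * p["machine_spec"]["accelerator_count"] for p in specs
    )


if __name__ == "__main__":
    # Hypothetical image URI for the team's CUDA-based training container.
    specs = build_a3_worker_pool_specs(
        "us-docker.pkg.dev/example-project/train/genomics:latest", num_nodes=4
    )
    print(total_gpus(specs))  # 4 VMs x 8 GPUs = 32
```

With the `google-cloud-aiplatform` package installed, a list like this could be passed as the `worker_pool_specs` argument of `aiplatform.CustomJob`, and the job could then be launched as a step in a Vertex AI pipeline. Treat the region, quota, and reservation details as things to verify for the actual project.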
GCP Professional Cloud Architect
Managing and provisioning a solution infrastructure