GCP Professional Cloud Architect Practice Question

An online ticketing platform runs a stateless container-based API that sees an average load of 50 requests per second (RPS) but experiences unpredictable flash-sale surges of up to 5,000 RPS for only a few minutes at a time. The current 3-node GKE Standard cluster is sized for peak demand, resulting in very low utilization most of the day and high operational overhead. You must redesign the compute layer so that it (1) keeps p99 latency below 300 ms during spikes, (2) eliminates almost all idle capacity costs, and (3) minimizes day-to-day infrastructure management effort. Which approach best meets these requirements?

  • Replace the GKE cluster with a regional managed instance group of 30 e2-standard-4 VMs behind an external HTTP(S) load balancer and enable autoscaling at 60 % CPU utilization.

  • Keep the existing GKE cluster and enable Cluster Autoscaler, setting the node pool to scale between 1 and 30 nodes and use a Horizontal Pod Autoscaler at 70 % CPU utilization.

  • Deploy the existing API container to Cloud Run with minimum instances set to 0, leave "CPU always allocated" disabled so that CPU is billed only while requests are being processed, and rely on automatic request-based scaling for traffic spikes.

  • Convert the current nodes to preemptible VMs to lower per-instance price and disable autoscaling to prevent scale-up delays during traffic spikes.
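
When comparing the options, it can help to estimate how many serving instances each load level needs. The sketch below is a rough back-of-the-envelope calculation using Little's law (requests in flight = arrival rate × mean latency). The 0.2 s mean latency and the per-instance concurrency of 80 are illustrative assumptions, not figures from the question, and the function name is hypothetical.

```python
import math


def instances_needed(rps: float, mean_latency_s: float, concurrency: int) -> int:
    """Estimate instances required to serve `rps` requests per second.

    Little's law: average requests in flight = arrival rate * mean latency.
    Each instance is assumed to handle `concurrency` requests at once.
    """
    in_flight = rps * mean_latency_s
    return math.ceil(in_flight / concurrency)


# Assumed values for illustration only (not given in the question).
MEAN_LATENCY_S = 0.2   # assumed average request latency
CONCURRENCY = 80       # assumed concurrent requests per instance

baseline = instances_needed(50, MEAN_LATENCY_S, CONCURRENCY)
peak = instances_needed(5000, MEAN_LATENCY_S, CONCURRENCY)
print(f"baseline instances: {baseline}, peak instances: {peak}")
```

Under these assumptions the gap between baseline and peak is more than tenfold, which is why capacity that is provisioned for peak sits mostly idle and why the speed of scale-out matters for the latency target.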

Designing and planning a cloud solution architecture