Cloud Capacity & Region Planning
Prioritize regions and capacity by aligning latency SLOs, demand forecasts, power availability, and chip supply to minimize the cost of scaling.
KPIs
Latency SLO Attainment Share
Share of traffic or minutes meeting the latency SLO across the region mix.
Higher is better
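As a rough illustration of the definition above, the share can be computed as a traffic-weighted ratio across regions. This is a minimal sketch, not the source's formula: the `RegionTraffic` type, its field names, and the choice of request counts (rather than minutes) as the weighting unit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class RegionTraffic:
    """Hypothetical per-region traffic summary for one reporting period."""
    region: str
    requests: int             # total requests served by the region
    requests_within_slo: int  # requests that met the latency SLO


def slo_attainment_share(regions: Iterable[RegionTraffic]) -> float:
    """Traffic-weighted share of requests meeting the SLO across the region mix."""
    regions = list(regions)
    total = sum(r.requests for r in regions)
    if total == 0:
        return 0.0  # assumption: report 0.0 when there is no traffic
    met = sum(r.requests_within_slo for r in regions)
    return met / total


# Illustrative numbers only.
mix = [
    RegionTraffic("region-a", requests=1_000, requests_within_slo=950),
    RegionTraffic("region-b", requests=500, requests_within_slo=400),
]
share = slo_attainment_share(mix)
print(f"SLO attainment share: {share:.1%}")
```

Weighting by traffic rather than averaging per-region percentages keeps a small, slow region from dominating the figure.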
Capacity Deficit Index
Normalized shortfall of available capacity vs. demand forecast plus provisioning buffer.
Higher is worse
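One plausible way to normalize the shortfall is to divide it by the required capacity, where required capacity is the forecast scaled up by the provisioning buffer. The sketch below assumes that normalization and a floor of zero when capacity exceeds the requirement; the function and parameter names are hypothetical.

```python
def capacity_deficit_index(
    available_gpus: float,
    forecast_gpus: float,
    buffer_share: float,
) -> float:
    """Shortfall vs. (forecast + buffer), normalized to the range [0, 1].

    Assumption: required capacity = forecast * (1 + buffer_share).
    """
    required = forecast_gpus * (1.0 + buffer_share)
    if required <= 0:
        return 0.0  # nothing is required, so there is no deficit
    shortfall = max(0.0, required - available_gpus)
    return shortfall / required


# Illustrative numbers only: 1,000 GPUs forecast with a 20% buffer.
index = capacity_deficit_index(available_gpus=900, forecast_gpus=1_000, buffer_share=0.20)
surplus = capacity_deficit_index(available_gpus=1_500, forecast_gpus=1_000, buffer_share=0.20)
print(f"deficit index: {index:.2f}, with surplus capacity: {surplus:.2f}")
```

With 900 GPUs available against a requirement of 1,200, the shortfall of 300 gives an index of about 0.25; any surplus is clamped to 0 rather than reported as a negative deficit.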
Effective Capacity Cost per GPU‑hour
Blended $/GPU‑hour from on‑demand, reserved, spot, and unused commitment effects.
Higher is worse
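A blended rate of this kind can be sketched as total spend divided by GPU-hours actually consumed. The example below assumes that unused commitment is treated as sunk spend added to the numerator, which is one common convention but not necessarily the one used here; the tier names, rates, and function name are illustrative.

```python
from typing import Dict, Tuple


def effective_cost_per_gpu_hour(
    usage: Dict[str, Tuple[float, float]],
    unused_commitment_usd: float = 0.0,
) -> float:
    """Blended $/GPU-hour.

    usage maps a purchase tier to (gpu_hours_used, usd_per_gpu_hour).
    Assumption: unused commitment is paid for but yields no hours, so it
    raises spend without raising the denominator.
    """
    used_hours = sum(hours for hours, _ in usage.values())
    if used_hours <= 0:
        raise ValueError("no GPU-hours were consumed in the period")
    usage_spend = sum(hours * rate for hours, rate in usage.values())
    return (usage_spend + unused_commitment_usd) / used_hours


# Illustrative numbers only.
usage = {
    "on_demand": (100.0, 4.00),
    "reserved": (300.0, 2.50),
    "spot": (100.0, 1.50),
}
blended = effective_cost_per_gpu_hour(usage, unused_commitment_usd=200.0)
without_waste = effective_cost_per_gpu_hour(usage)
print(f"blended: ${blended:.2f}/GPU-hour, ignoring unused commitment: ${without_waste:.2f}")
```

Comparing the two figures shows how much of the effective rate comes from commitments that were purchased but never applied to usage.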
Multi‑region Resilience Index
Probability‑weighted index of the ability to serve demand under AZ/region failure scenarios, adjusted for observed failover success.
Higher is better
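The source does not give a formula for this index. One way to make the idea concrete is to weight each failure scenario by its probability, score it by the fraction of demand the surviving capacity could serve, and discount that by the observed failover success rate. Everything in the sketch below (the `FailureScenario` type, its fields, and the weighting scheme) is an assumption for illustration.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class FailureScenario:
    """Hypothetical failure scenario used to score resilience."""
    name: str
    probability: float            # estimated likelihood in the planning period
    demand_gpus: float            # demand that must be served during the failure
    surviving_gpus: float         # capacity left once the AZ/region is lost
    failover_success_rate: float  # observed share of failovers that succeeded


def resilience_index(scenarios: Iterable[FailureScenario]) -> float:
    """Probability-weighted servable share of demand, in [0, 1] (assumed form)."""
    scenarios = list(scenarios)
    total_probability = sum(s.probability for s in scenarios)
    if total_probability <= 0:
        raise ValueError("scenario probabilities must sum to a positive value")
    weighted = 0.0
    for s in scenarios:
        # Fraction of demand that surviving capacity can cover, capped at 100%.
        servable = min(1.0, s.surviving_gpus / s.demand_gpus) if s.demand_gpus > 0 else 1.0
        weighted += s.probability * servable * s.failover_success_rate
    return weighted / total_probability


# Illustrative numbers only.
scenarios = [
    FailureScenario("single-AZ loss", 0.06, demand_gpus=100, surviving_gpus=120, failover_success_rate=0.95),
    FailureScenario("full-region loss", 0.02, demand_gpus=100, surviving_gpus=70, failover_success_rate=0.90),
]
score = resilience_index(scenarios)
print(f"resilience index: {score:.2f}")
```

Normalizing by the total scenario probability keeps the index between 0 and 1 even when the listed scenarios do not cover every possible outcome.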
Compliance Coverage Share
Portion of demand served from regions meeting data‑residency/regulatory constraints for the workload.
Higher is better
Egress Cost per GB
Effective $/GB for inter‑region/Internet data transfer under the current region plan.
Higher is worse
Power Availability Lead Time (months)
Expected months to secure incremental MW capacity in target regions, covering both provider and interconnect milestones.
Higher is worse
Chip Supply Alignment Index
Fit between GPU supply (deliveries/allocations) and planned demand across regions and priority tiers.
Higher is better
Internal Factors
GPU Allocation Inventory
GPUs available for allocation (delivered and commissioned but not yet assigned).
Higher is better
Capacity Request Backlog (GPU)
Outstanding GPU capacity requests awaiting provider approval/quota increase.
Power‑permit Queue Backlog (MW)
MW awaiting permitting/interconnect approval in targeted regions/campuses.
Savings Plan Unused Commitment (USD)
Cumulative unused Savings Plan commitment within the current accrual period.
USD (millions)
RI Unused GPU‑hours
Purchased reserved GPU‑hours not applied to usage in the period.
Workload Placement Backlog (requests)
Pending placement requests waiting for region assignment given constraints.
Levers
RI/Savings Plan Coverage Target Share
Target fraction of compute cost/hours to cover with RIs or Savings Plans.
Redundant Regions Count
Number of active regions provisioned for failover (N‑way).
Traffic Steering Aggressiveness Index
Degree to which routing favors best‑latency/lowest‑cost regions among eligible choices.
Data Locality Enforcement Index
Strictness of data‑residency enforcement in placement and routing.
Provisioning Buffer Target Share
Planned headroom over forecast to absorb demand and lead‑time variance.
