Model Lifecycle TCO

Estimate $/token and total cost across training, fine‑tuning, and inference, and see how utilization, power price, and hardware choices move your unit economics.

KPIs

Token Cost ($/1k)
Unit cost per 1,000 tokens across the model lifecycle, including amortized training/fine‑tuning and inference OPEX (see the sketch after this list).
Higher is worse
Lifecycle Total Cost
Total cost over the planning horizon: training + fine‑tuning + inference.
Higher is worse
Training Cost
Total cost of training runs for the next major model release.
Higher is worse
Fine‑Tuning Cost
Total cost of fine‑tuning and alignment runs for this release window.
Higher is worse
Inference Cost
Monthly OPEX of serving inference, including power, hardware amortization, and platform overhead.
Higher is worse
TCO Index
Normalized 0–1 index summarizing lifecycle total cost for scenario comparison.
Higher is worse
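
For concreteness, here is a minimal sketch of how Token Cost and TCO Index could be computed from the KPIs above. It assumes straight‑line amortization of training and fine‑tuning over the horizon's token volume and min–max normalization across scenarios; all function names and conventions are illustrative assumptions, not the tool's actual internals.

```python
def lifecycle_total(training_cost, finetune_cost,
                    inference_opex_monthly, horizon_months):
    # Lifecycle Total Cost: training + fine-tuning + inference over the horizon.
    return (training_cost + finetune_cost
            + inference_opex_monthly * horizon_months)

def token_cost_per_1k(training_cost, finetune_cost,
                      inference_opex_monthly, horizon_months,
                      tokens_per_month):
    # Token Cost ($/1k): one-time costs amortized straight-line over every
    # token served in the horizon, plus recurring inference OPEX
    # (assumed convention; the tool may amortize differently).
    total = lifecycle_total(training_cost, finetune_cost,
                            inference_opex_monthly, horizon_months)
    return 1_000 * total / (tokens_per_month * horizon_months)

def tco_index(scenario_total, best_total, worst_total):
    # TCO Index: min-max normalization to [0, 1] across the scenarios
    # being compared; higher = worse (normalization scheme is assumed).
    return (scenario_total - best_total) / (worst_total - best_total)
```

As a worked example with invented figures: a $50M training run and $5M of fine‑tuning amortized over a 24‑month horizon at 100B tokens/month, plus $2M/month inference OPEX, comes to $103M lifecycle total over 2.4T tokens, or about $0.043 per 1k tokens.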

Internal Factors

Utilization Rate
Share of accelerator wall‑time spent doing useful work (active/available).
Higher is better
Power Price ($/MWh)
Blended electricity price per MWh after tariffs and hedges.
Higher is worse
Power System Overhead Index (from PUE)
Facility power overhead fraction derived from Power Usage Effectiveness (PUE); 0 means no facility overhead (see the sketch after this list).
Higher is worse
Accelerator Installed Base
Count of deployed accelerators available for training/inference.
Average Sequence Length
Average total tokens per request (prompt + completion).
Higher is worse
Energy Consumption
Monthly facility‑level energy usage for the AI fleet.
Higher is worse
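
A sketch of how the power‑side factors could combine, assuming the overhead index is PUE − 1 (extra facility watts per IT watt) and a 730‑hour average month; these conventions and names are assumptions, not the tool's definitions.

```python
HOURS_PER_MONTH = 730  # average month; assumed convention

def power_overhead_index(pue):
    # Power System Overhead Index: 0 at PUE = 1.0 (no facility overhead);
    # e.g., PUE 1.25 -> 0.25. The PUE - 1 convention is an assumption.
    return pue - 1.0

def monthly_energy_mwh(it_load_mw, pue):
    # Energy Consumption: IT load scaled up by PUE to facility level.
    return it_load_mw * pue * HOURS_PER_MONTH

def monthly_power_cost(it_load_mw, pue, power_price_per_mwh):
    # Power Price flows into inference OPEX via facility-level energy.
    return monthly_energy_mwh(it_load_mw, pue) * power_price_per_mwh

def effective_throughput(installed_base, tokens_per_accel_hour, utilization):
    # Utilization Rate gates how much of the Accelerator Installed Base
    # does useful work each month.
    return (installed_base * tokens_per_accel_hour
            * utilization * HOURS_PER_MONTH)
```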

Levers

Utilization Target
Scheduling/placement target for fleet utilization subject to SLAs.
Power Hedge Share
Share of expected load under fixed/hedged electricity pricing (see the sketch after this list).
PUE Improvement CAPEX
Capital expenditure aimed at lowering facility overhead (cooling/power path).
Accelerator Generation Choice
Chosen accelerator generation for the fleet (e.g., H100, B200, MI300X) with implied perf/W and memory profile.
Model Release Cadence
Days between major model releases; ties training budget to roadmap.
Context Length Limit
Platform cap on total tokens per request to bound inference cost tails.
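
The levers map onto the factors above in simple ways; a sketch under the obvious assumptions (linear price blending, one full training run per release, hard truncation at the context cap), with illustrative names throughout.

```python
def blended_power_price(hedge_share, hedged_price_per_mwh, spot_price_per_mwh):
    # Power Hedge Share: linear blend of fixed and spot electricity pricing.
    return (hedge_share * hedged_price_per_mwh
            + (1.0 - hedge_share) * spot_price_per_mwh)

def annual_training_budget(cost_per_release, release_cadence_days):
    # Model Release Cadence: shorter cadence -> more releases per year,
    # so more training spend (assumes one training run per release).
    return cost_per_release * (365.0 / release_cadence_days)

def capped_request_tokens(prompt_tokens, completion_tokens, context_length_limit):
    # Context Length Limit: caps total tokens per request, bounding the
    # tail of per-request inference cost.
    return min(prompt_tokens + completion_tokens, context_length_limit)
```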
