Forces
The numbers show where the flywheel hits the physical world.
As the industry races to serve larger models faster, the old rules of compute keep breaking.
New paradigms get adopted where the flywheel collides with physics,
bandwidth, power, and the limits of scale.
01
Better hardware
Compute Density
$50K N2 wafer cost
What it indicates: Leading-edge scaling is no longer free. Each new node asks the system to pay more for smaller gains.
Illustrative example: High-NA EUV tools are roughly $400M machines. The question is no longer whether physics can print smaller features, but whether the economics justify the next turn.
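The economics can be made concrete with a back-of-envelope die-cost estimate. The die size and the gross-die formula below are illustrative assumptions (a standard area-ratio approximation with an edge-loss correction), not any vendor's actual yield model.

```python
import math

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Crude gross-die count: wafer area over die area, minus an
    edge-loss term. Real shot maps and scribe lines differ."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

wafer_cost = 50_000                  # N2 wafer cost from the text
dies = gross_dies_per_wafer(800)     # near-reticle-limit die, illustrative
print(f"{dies} gross dies -> ${wafer_cost / dies:,.0f} per die before yield")
```

Yield losses at a new node push the real per-die cost well above this floor, which is the point: each shrink now starts from a higher baseline.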
02
Better and more
Memory & Bandwidth
22 TB/s Rubin HBM4 bandwidth
What it indicates: Inference makes memory the binding constraint: context length, KV cache, bandwidth, and supply all matter at once.
Illustrative example: HBM4 is arriving at roughly 2x HBM3E pricing while consuming far more wafer capacity per bit than commodity DRAM. The old commodity memory cycle is becoming strategic infrastructure.
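Why inference makes memory the constraint is easiest to see in a KV-cache estimate. The model shape below (a 70B-class network with grouped-query attention) is an assumed configuration for illustration, not a specific product's specs.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   context_len, bytes_per_elem=2):
    """Bytes of KV cache for one sequence: a K and a V tensor per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem

# Assumed 70B-class shape: 80 layers, 8 KV heads of dim 128, fp16 values.
per_seq = kv_cache_bytes(80, 8, 128, 128_000)
print(f"{per_seq / 1e9:.1f} GB of KV cache per 128k-token sequence")
```

Tens of gigabytes per long-context sequence, multiplied across concurrent users, is what turns HBM capacity and bandwidth from a component choice into the pacing item.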
03
Better hardware
Advanced Packaging
858 mm² single-die reticle ceiling
What it indicates: When a chip cannot simply get bigger, the package becomes the computer.
Illustrative example: Blackwell already stitches two reticle-scale compute dies through advanced packaging. CoWoS capacity, hybrid bonding, substrates, and known-good-die testing become gates on AI system volume.
04
Better hardware
Networking & Optics
10^-12 AI-cluster bit-error target
What it indicates: The cluster is only as fast as the fabric that lets thousands of accelerators act like one machine.
Illustrative example: At 200G PAM4, copper and short-reach optics run into physical limits. Rubin Ultra-class systems imply thousands of optical modules per rack-scale system, pushing the market toward silicon photonics and co-packaged optics.
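The 10^-12 target is easier to feel as an error rate. The arithmetic below is a first-order sketch; the 10,000-lane fabric size is an assumption, and real links add forward error correction precisely because the raw rate is worse than this.

```python
def raw_errors_per_second(line_rate_bps, ber):
    """Expected raw bit errors per second on one lane at a given BER."""
    return line_rate_bps * ber

per_lane = raw_errors_per_second(200e9, 1e-12)  # one 200G lane at target BER
fabric = per_lane * 10_000                       # assumed 10k lanes in a fabric
print(f"{per_lane:.1f} errors/s per lane, {fabric:.0f} errors/s fabric-wide")
```

Even at the target, a large fabric sees thousands of raw errors per second, which is why error correction, link health telemetry, and optics reliability become system-level design inputs.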
05
More hardware
Power & Cooling
120 kW GB200 NVL72 rack
What it indicates: AI racks are becoming industrial electrical loads, not ordinary data-center equipment.
Illustrative example: Legacy racks were often 5-20 kW. NVIDIA's Kyber reference points toward 600 kW racks, where 48V copper becomes unworkable and 800V distribution starts to look necessary.
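The jump from 48V to 800V falls out of Ohm's law. The sketch below ignores conversion losses and is only a first-order comparison of bus current at the 600 kW figure from the text.

```python
def bus_current_amps(power_watts, bus_volts):
    """I = P / V; first-order, no conversion losses."""
    return power_watts / bus_volts

rack_w = 600_000  # Kyber-class 600 kW rack
print(f"48V bus:  {bus_current_amps(rack_w, 48):,.0f} A")
print(f"800V bus: {bus_current_amps(rack_w, 800):,.0f} A")
```

Moving 12,500 A over copper busbars inside a rack is impractical on conductor cross-section and resistive-loss grounds alone; 750 A is merely hard, which is the case for higher-voltage distribution.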
06
More hardware
Data Center Scale
1 GW frontier-cluster scale
What it indicates: GPU supply is no longer the only pacing item. Power, land, permitting, cooling, and execution speed now set the ceiling.
Illustrative example: xAI's Colossus build showed that construction speed can become a moat. At the same time, interconnect queues measured in years can slow even well-capitalized buyers.
07
Better and more
Inference Distribution
24x modeled token growth by 2030
What it indicates: The workload is shifting from occasional training runs to continuous, latency-sensitive use.
Illustrative example: Agentic systems reuse context, generate long traces, and make caching valuable. The relevant metric shifts from cost per token toward cost per completed task.
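The shift from cost per token to cost per task can be sketched with a blended-cost calculation. All inputs here are hypothetical (token counts, prices, the 90% cached-token discount, the 80% hit rate); the point is the shape of the arithmetic, not the numbers.

```python
def cost_per_task(tokens_per_task, price_per_mtok,
                  cache_hit_rate=0.0, cached_discount=0.9):
    """Blended dollar cost of one completed task, assuming cached
    tokens are billed at a (hypothetical) steep discount."""
    full_price_tokens = tokens_per_task * (1 - cache_hit_rate)
    cached_tokens = tokens_per_task * cache_hit_rate * (1 - cached_discount)
    return (full_price_tokens + cached_tokens) * price_per_mtok / 1e6

# Assumed agentic task: 500k tokens of context reuse at $2 per million tokens.
no_cache = cost_per_task(500_000, 2.0)
with_cache = cost_per_task(500_000, 2.0, cache_hit_rate=0.8)
print(f"no cache: ${no_cache:.2f}, 80% cache hits: ${with_cache:.2f}")
```

Under these assumptions caching cuts the task cost from $1.00 to $0.28, which is why long-trace agentic workloads reward providers who optimize for completed tasks rather than raw token throughput.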
08
Value migration
Application Migration
$13T US labor pool at stake
What it indicates: As intelligence gets cheaper, value migrates toward workflows where software can absorb work instead of merely supporting it.
Illustrative example: Call centers, coding teams, IT access requests, and governed enterprise search already show the pattern: the value is not just cheaper tokens, but work moving into new operating systems.