Forces
The numbers show where the flywheel hits the physical world.
As the industry races to serve larger models faster, the old rules of compute keep breaking.
New paradigms get adopted where the flywheel collides with physics,
bandwidth, power, and the limits of scale.
01
Better hardware
Compute Density
$50K N2 wafer cost
What it indicates: Leading-edge scaling is no longer free. Each new node asks the system to pay more for smaller gains.
Illustrative example: High-NA EUV tools are roughly $400M machines. The question is no longer whether physics can print smaller features, but whether the economics justify the next turn.
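The economics can be made concrete with a back-of-envelope die-cost estimate. The die size and the gross-die formula below are illustrative assumptions (a standard area-ratio approximation with an edge-loss correction), not any vendor's actual yield model.

```python
import math

def gross_dies_per_wafer(die_area_mm2, wafer_diameter_mm=300):
    """Crude gross-die count: wafer area over die area, minus an
    edge-loss term. Real shot maps and scribe lines differ."""
    wafer_area = math.pi * (wafer_diameter_mm / 2) ** 2
    edge_loss = math.pi * wafer_diameter_mm / math.sqrt(2 * die_area_mm2)
    return int(wafer_area / die_area_mm2 - edge_loss)

wafer_cost = 50_000                  # N2 wafer cost from the text
dies = gross_dies_per_wafer(800)     # near-reticle-limit die, illustrative
print(f"{dies} gross dies -> ${wafer_cost / dies:,.0f} per die before yield")
```

Yield losses at a new node push the real per-die cost well above this floor, which is the point: each shrink now starts from a higher baseline.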
02
Better and more
Memory & Bandwidth
22 TB/s Rubin HBM4 bandwidth
What it indicates: Inference makes memory the binding constraint: context length, KV cache, bandwidth, and supply all matter at once.
Illustrative example: HBM4 is arriving at roughly 2x HBM3E pricing while consuming far more wafer capacity per bit than commodity DRAM. The old commodity memory cycle is becoming strategic infrastructure.
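Why inference makes memory the constraint is easiest to see in a KV-cache estimate. The model shape below (a 70B-class network with grouped-query attention) is an assumed configuration for illustration, not a specific product's specs.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim,
                   context_len, bytes_per_elem=2):
    """Bytes of KV cache for one sequence: a K and a V tensor per layer."""
    return 2 * num_layers * num_kv_heads * head_dim * context_len * bytes_per_elem

# Assumed 70B-class shape: 80 layers, 8 KV heads of dim 128, fp16 values.
per_seq = kv_cache_bytes(80, 8, 128, 128_000)
print(f"{per_seq / 1e9:.1f} GB of KV cache per 128k-token sequence")
```

Tens of gigabytes per long-context sequence, multiplied across concurrent users, is what turns HBM capacity and bandwidth from a component choice into the pacing item.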
03
Better hardware
Advanced Packaging
858 mm² single-die reticle ceiling
What it indicates: When a chip cannot simply get bigger, the package becomes the computer.
Illustrative example: Blackwell already stitches two reticle-scale compute dies through advanced packaging. CoWoS capacity, hybrid bonding, substrates, and known-good-die testing become gates on AI system volume.
04
Better hardware
Networking & Optics
10^-12 AI-cluster bit-error target
What it indicates: The cluster is only as fast as the fabric that lets thousands of accelerators act like one machine.
Illustrative example: At 200G PAM4, copper and short-reach optics run into physical limits. Rubin Ultra-class systems imply thousands of optical modules per rack-scale system, pushing the market toward silicon photonics and co-packaged optics.
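The 10^-12 target is easier to feel as an error rate. The arithmetic below is a first-order sketch; the 10,000-lane fabric size is an assumption, and real links add forward error correction precisely because the raw rate is worse than this.

```python
def raw_errors_per_second(line_rate_bps, ber):
    """Expected raw bit errors per second on one lane at a given BER."""
    return line_rate_bps * ber

per_lane = raw_errors_per_second(200e9, 1e-12)  # one 200G lane at target BER
fabric = per_lane * 10_000                       # assumed 10k lanes in a fabric
print(f"{per_lane:.1f} errors/s per lane, {fabric:.0f} errors/s fabric-wide")
```

Even at the target, a large fabric sees thousands of raw errors per second, which is why error correction, link health telemetry, and optics reliability become system-level design inputs.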
05
More hardware
Power & Cooling
120 kW GB200 NVL72 rack
What it indicates: AI racks are becoming industrial electrical loads, not ordinary data-center equipment.
Illustrative example: Legacy racks were often 5-20 kW. NVIDIA's Kyber reference points toward 600 kW racks, where 48V copper becomes unworkable and 800V distribution starts to look necessary.
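The jump from 48V to 800V falls out of Ohm's law. The sketch below ignores conversion losses and is only a first-order comparison of bus current at the 600 kW figure from the text.

```python
def bus_current_amps(power_watts, bus_volts):
    """I = P / V; first-order, no conversion losses."""
    return power_watts / bus_volts

rack_w = 600_000  # Kyber-class 600 kW rack
print(f"48V bus:  {bus_current_amps(rack_w, 48):,.0f} A")
print(f"800V bus: {bus_current_amps(rack_w, 800):,.0f} A")
```

Moving 12,500 A over copper busbars inside a rack is impractical on conductor cross-section and resistive-loss grounds alone; 750 A is merely hard, which is the case for higher-voltage distribution.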
06
More hardware
Data Center Scale
1 GW frontier-cluster scale
What it indicates: GPU supply is no longer the only pacing item. Power, land, permitting, cooling, and execution speed now set the ceiling.
Illustrative example: xAI's Colossus build showed that construction speed can become a moat. At the same time, interconnect queues measured in years can slow even well-capitalized buyers.
07
Better and more
Inference Distribution
24x modeled token growth by 2030
What it indicates: The workload is shifting from occasional training runs to continuous, latency-sensitive use.
Illustrative example: Agentic systems reuse context, generate long traces, and make caching valuable. The relevant metric shifts from cost per token toward cost per completed task.
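The shift from cost per token to cost per task can be sketched with a blended-cost calculation. All inputs here are hypothetical (token counts, prices, the 90% cached-token discount, the 80% hit rate); the point is the shape of the arithmetic, not the numbers.

```python
def cost_per_task(tokens_per_task, price_per_mtok,
                  cache_hit_rate=0.0, cached_discount=0.9):
    """Blended dollar cost of one completed task, assuming cached
    tokens are billed at a (hypothetical) steep discount."""
    full_price_tokens = tokens_per_task * (1 - cache_hit_rate)
    cached_tokens = tokens_per_task * cache_hit_rate * (1 - cached_discount)
    return (full_price_tokens + cached_tokens) * price_per_mtok / 1e6

# Assumed agentic task: 500k tokens of context reuse at $2 per million tokens.
no_cache = cost_per_task(500_000, 2.0)
with_cache = cost_per_task(500_000, 2.0, cache_hit_rate=0.8)
print(f"no cache: ${no_cache:.2f}, 80% cache hits: ${with_cache:.2f}")
```

Under these assumptions caching cuts the task cost from $1.00 to $0.28, which is why long-trace agentic workloads reward providers who optimize for completed tasks rather than raw token throughput.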
08
Value migration
Application Migration
$13T US labor pool at stake
What it indicates: As intelligence gets cheaper, value migrates toward workflows where software can absorb work instead of merely supporting it.
Illustrative example: Call centers, coding teams, IT access requests, and governed enterprise search already show the pattern: the value is not just cheaper tokens, but work moving into new operating systems.