April 2, 2026 | Anthropic | inference | neoclouds | KAIROS | infrastructure

The Anthropic Leak and the Daemon Tax: What KAIROS Means for Inference Economics

On March 31, Anthropic accidentally shipped 512,000 lines of Claude Code's unobfuscated source to npm. Inside: a daemon called KAIROS that converts every seat into persistent inference. The cost model that underpins the GPU infrastructure buildout just changed.

Figure: KAIROS daemon process, a wake / evaluate / sleep cycle with 15-second tick intervals. Callouts: 512K lines leaked, 1.3-20x cost range, 90.6% CPU bottleneck, $19B ARR.

TLDR

KAIROS is a daemon that wakes every 15 seconds and consolidates memory overnight. Every Claude Code seat becomes a persistent compute process, not a session. The infrastructure stack was priced for bursty, statistically multiplexable inference. Daemons are neither.

Cost range per seat: 1.3x at the floor (daemon mostly idle), 20x at the ceiling (high-agency mode). Agentic workloads are CPU-bottlenecked, not GPU-bottlenecked. Neoclouds built GPU farms. The agent runtime layer doesn't exist yet. The request-response inference model is ending.

AI Briefly Pro | Deep Dive | ~5,000 words | April 2, 2026

What Actually Leaked
KAIROS
Anthropic's Chip Strategy: Co-Design Everything, Own Nothing
How Much Does the Daemon Actually Cost
Where the Tokens Go (CPU Bottleneck)
The Subsidy Problem
How Do You Price an Always-On Agent
Everyone Wants Always-On AI. Nobody Has One.
The Neocloud Problem
What Breaks This
How to Position

Disclosure: I previously worked at CoreWeave and Google. AI Briefly contributors may hold positions in companies mentioned. Not investment advice.

The Short Version

KAIROS is a daemon that wakes every 15 seconds and consolidates memory overnight. Every Claude Code seat becomes a persistent compute process, not a session. Rollout is gated for May 2026, employees first.
The infrastructure stack was priced for bursty, statistically multiplexable inference. Daemons are neither. Even a mostly-sleeping daemon forces capacity planning to shift from overcommit to reserved guarantees - a different business with different margins.
Cost range per seat: 1.3x at the floor (daemon mostly idle), 20x at the ceiling (high-agency mode running hot during work hours). The base case scenarios between those bounds are what nobody has modeled yet.
Agentic workloads are CPU-bottlenecked. Georgia Tech and Intel measured it - up to 90.6% of agent latency sits on CPU-side tool processing (arXiv:2511.00739). Nvidia shipped a dedicated agentic CPU at GTC. Neoclouds built GPU farms.
Anthropic isn't building a chip. It's co-designing Trainium with Annapurna, bought 400K TPUv7s outright from Broadcom, and keeps Nvidia as the third leg. Better play than a custom ASIC at $19B ARR.
CoreWeave has $14B in debt tied to the wrong hardware mix.
The agent runtime layer - the operating system for persistent AI - doesn't exist yet. Hyperscalers will get there. The window for someone else to own it is 12-24 months.
The request-response inference model that the entire infrastructure stack was sized and priced against is ending.
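
To make the arithmetic behind those points concrete, here is a minimal back-of-envelope sketch. Only the 15-second tick and the 1.3x / 20x bounds come from the figures above. Everything else is an assumption made for illustration: the linear blend between floor and ceiling, the independent-session (binomial) model of bursty demand, and the function names are all hypothetical, not anything from the leaked source.

```python
import math

TICK_SECONDS = 15          # from the article: the daemon wakes every 15 seconds
FLOOR_MULTIPLIER = 1.3     # from the article: daemon mostly idle
CEILING_MULTIPLIER = 20.0  # from the article: high-agency mode


def ticks_per_day(tick_seconds: int = TICK_SECONDS) -> int:
    """Wake-ups per seat over 24 hours."""
    return 86_400 // tick_seconds


def blended_multiplier(hot_fraction: float) -> float:
    """Cost multiplier per seat under an ASSUMED linear blend.

    hot_fraction is the hypothetical share of the day spent in
    high-agency mode; the rest of the day is billed at the floor.
    """
    if not 0.0 <= hot_fraction <= 1.0:
        raise ValueError("hot_fraction must be between 0 and 1")
    return FLOOR_MULTIPLIER + hot_fraction * (CEILING_MULTIPLIER - FLOOR_MULTIPLIER)


def overcommitted_slots(seats: int, active_prob: float, z: float = 3.0) -> int:
    """Slots to provision for bursty sessions (ASSUMED independent).

    Uses binomial mean plus z standard deviations, capped at the seat count.
    """
    mean = seats * active_prob
    sigma = math.sqrt(seats * active_prob * (1.0 - active_prob))
    return min(seats, math.ceil(mean + z * sigma))


def reserved_slots(seats: int) -> int:
    """Slots to provision when every seat is an always-on process."""
    return seats


if __name__ == "__main__":
    print("wake-ups per seat per day:", ticks_per_day())
    print("multiplier at 8 hot hours:", round(blended_multiplier(8 / 24), 2))
    print("bursty slots for 1,000 seats at 10% active:", overcommitted_slots(1000, 0.10))
    print("reserved slots for 1,000 seats:", reserved_slots(1000))
```

A 15-second tick works out to 5,760 wake-ups per seat per day. Under the assumed linear blend, eight hot hours in a 24-hour day lands near 7.5x. The last two lines show why the shift from overcommit to reservation matters: with these illustrative inputs, a multiplexed fleet provisions a small fraction of the seat count, while a daemon fleet provisions all of it.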

How to Position

Long: Anthropic (pre-IPO, $19B ARR, daemon-first architecture). Nvidia Vera CPU and AMD EPYC (CPU bottleneck reprices hardware mix). Nebius (closest to owning the agent runtime layer).

Short / caution: GPU-only rental neoclouds without software layers. CoreWeave at current multiples if BMaaS margins stay at 14-16%. Any company whose unit economics are built on current subsidized API rates.
