Mentat · curated
Artificial Intelligence · medium · first-party

Halving the AI bill

Coinbase says it cut its internal AI spend by about half while usage kept climbing — by fixing plumbing, not by telling staff to use AI less.

Brian Armstrong, Coinbase's CEO, posted the company's internal cost-cutting playbook this week, and the part worth noting is what it isn't: there were no usage caps. Token consumption kept growing while the bill, by the company's own account, fell by roughly half.

The limiting factor will be energy and compute, not better models. — Brian Armstrong

The savings came from four pieces of plumbing. Set cheaper open-weight models as the default while letting engineers override. Route each request to a model that already has the answer cached, or failing that, to the one that charges least. Optimize the context sent with each request. And actually cache: the company says a proper caching fix took the hit rate on its internal chat tool from 5% to 60%, meaning most requests now reuse work already paid for. In an internal survey, 91% of employees said they noticed no change in their AI access, which suggests the waste sat in the infrastructure rather than in how people used the tools.
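As a rough illustration of how three of those pieces (a cheap default with an override, cache-aware routing, and a response cache) might fit together in one gateway, here is a minimal sketch. It is not Coinbase's implementation: the model names, the per-request prices, the `Gateway` class and the exact-match cache are all hypothetical, and the source does not say which kind of caching the company fixed.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical model names and flat per-request prices, for illustration only.
PRICES: Dict[str, float] = {"open-weight-default": 1.0, "frontier": 10.0}


@dataclass
class Gateway:
    # call_model(model, prompt) -> response; supplied by the caller (assumed interface).
    call_model: Callable[[str, str], str]
    cache: Dict[str, str] = field(default_factory=dict)
    hits: int = 0
    misses: int = 0
    spend: float = 0.0

    def _key(self, model: str, prompt: str) -> str:
        # Collapse whitespace so trivially different prompts share a cache entry.
        normalized = " ".join(prompt.split())
        return hashlib.sha256(f"{model}\n{normalized}".encode()).hexdigest()

    def complete(self, prompt: str, model: Optional[str] = None) -> str:
        # An explicit override pins the model; otherwise consider every
        # model, cheapest first, so the cheapest one acts as the default.
        candidates = [model] if model else sorted(PRICES, key=PRICES.get)

        # Prefer any candidate that already has this answer cached.
        for name in candidates:
            cached = self.cache.get(self._key(name, prompt))
            if cached is not None:
                self.hits += 1
                return cached

        # Cache miss: pay for the first (cheapest or overridden) candidate.
        chosen = candidates[0]
        self.misses += 1
        self.spend += PRICES[chosen]
        response = self.call_model(chosen, prompt)
        self.cache[self._key(chosen, prompt)] = response
        return response

    @property
    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

A caller would construct `Gateway(call_model=...)` with its own client function and pass `model="frontier"` only when the default is not good enough. Tracking `hit_rate` and `spend` in the gateway is what would make a figure like the reported 5% to 60% measurable in the first place.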

Two of the named defaults are worth pausing on: GLM 5.2 and Kimi 2.7, open-weight models from Chinese labs, chosen by a US-listed public company as good enough for the bulk of routine work. The quiet signal under the cost story is where open weights now sit on the price-versus-quality line — close enough that a frontier-spending firm reaches for them first.

Every figure here is self-reported and unaudited, so treat the exact halving with care. But the shape is the lesson for any company watching its AI bill grow: the fix is the gateway and the cache, not a memo asking people to ration.
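One way to sanity-check the caching claim is to work out what a hit-rate change alone could do to the cost per request. The sketch below is back-of-envelope arithmetic under stated assumptions, not a reconstruction of Coinbase's numbers: it assumes every uncached request costs the same, that a cache hit costs a fixed fraction of that (`hit_cost_fraction`, a hypothetical parameter), and it covers only the one chat tool for which a hit rate was reported.

```python
def relative_cost(hit_rate: float, hit_cost_fraction: float = 0.0) -> float:
    """Average cost per request, relative to an uncached request (= 1.0).

    hit_cost_fraction is an assumed price for a cache hit as a share of
    a full request; 0.0 treats hits as free.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    return (1.0 - hit_rate) + hit_rate * hit_cost_fraction


# Reported hit rates for the internal chat tool: 5% before, 60% after.
before = relative_cost(0.05)   # 0.95 of requests are paid in full
after = relative_cost(0.60)    # 0.40 of requests are paid in full
reduction = 1.0 - after / before

# With free cache hits, the per-request cost for that tool drops by about 58%.
print(f"{reduction:.1%}")
```

If cache hits are billed at, say, a tenth of a full request (`hit_cost_fraction=0.1`), the same calculation gives a reduction of roughly 52%. Either way this is a ceiling for one tool, not evidence for the company-wide halving.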

The lenses

Novelty 3
Impact · breadth 4
Impact · depth 3
Actionable 2
Substance 2
Hype 3

The facts

The cut: Internal AI spend roughly halved while usage grew; no usage caps (self-reported)
The lever: Cache hit rate on its internal chat tool went from 5% to 60%
Defaults: Cheaper open-weight models (GLM 5.2, Kimi 2.7) set as defaults; engineers can override
Felt impact: 91% of employees reported no change in their AI access
Open digg.com →
