Mentat curated
Artificial Intelligence medium · first-party

pxpipe

A local proxy that renders the cold, bulky parts of a Claude request into a PNG before sending, because an image of text bills by the pixel, not the character.

A dense page of text rendered as a single image holds roughly 92,000 characters but costs the model about 4,761 vision tokens to read: roughly 19 characters per token, against the three or four characters per token that ordinary English text typically gets when it goes as words. pxpipe, a drop-in local proxy, exploits that gap: it sits between Claude Code and Anthropic's API, renders as images the parts of each request that are big and stale (old chat history, tool output, the system prompt), and sends recent turns as ordinary text. Its maker reports a 59-70% smaller bill. The proxy gates each request through a profitability estimator, so it images context only when imaging is cheaper.
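The arithmetic behind that gap, and the shape of a profitability gate, can be sketched in a few lines. This is an illustration only, not pxpipe's actual estimator: the function names are hypothetical, the 3.5 characters per text token is an assumed average for English rather than a figure from the source, and the sketch assumes a partly filled page is billed like a full one.

```python
import math

# Figures quoted in the paragraph above.
PAGE_CHARS = 92_000
PAGE_VISION_TOKENS = 4_761

# Assumption (not from the source): English text averages about
# 3.5 characters per text token.
ASSUMED_CHARS_PER_TEXT_TOKEN = 3.5


def chars_per_vision_token() -> float:
    """Characters carried by one vision token on a dense page."""
    return PAGE_CHARS / PAGE_VISION_TOKENS


def estimate_text_tokens(text: str) -> int:
    """Rough token cost of sending the text as ordinary words."""
    return math.ceil(len(text) / ASSUMED_CHARS_PER_TEXT_TOKEN)


def estimate_image_tokens(text: str) -> int:
    """Rough token cost of sending the text as page images.

    Assumption: every page costs the full page price, even if it is
    only partly filled.
    """
    pages = math.ceil(len(text) / PAGE_CHARS)
    return pages * PAGE_VISION_TOKENS


def should_image(text: str) -> bool:
    """Hypothetical gate: image only when it is strictly cheaper."""
    return estimate_image_tokens(text) < estimate_text_tokens(text)


if __name__ == "__main__":
    print(f"{chars_per_vision_token():.1f} characters per vision token")
    for size in (100, 20_000, 92_000):
        print(size, should_image("x" * size))
```

Under these assumptions a short snippet stays as text, because a whole page image costs more than the few tokens it replaces, while anything past roughly 17,000 characters is cheaper as an image. That is the kind of per-request decision a gate like this would make.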

The idea isn't new — Google's CLIPPO, the PIXEL language model, DeepSeek's optical-compression work, and an Andrej Karpathy argument last October all pointed at pixels as a denser way to feed a model than text tokens. What pxpipe adds is that it ships: a proxy you can run today against live pricing, not a paper or a benchmark. The per-token math it rests on is independently grounded; the 59-70% headline is the maker's own end-to-end figure, not a reproduced benchmark, and the repo's own docs were largely written by AI agents running behind pxpipe itself.

The catch is the reason this is a tradeoff, not a free lunch: reading text from an image is lossy, and it fails silently. Asked to recall an exact twelve-character hex string buried in dense imaged content, Claude Fable 5 got 13 of 15; Opus got 0 of 15 — and the misses weren't refusals or errors but confident wrong answers. So the same trick is roughly safe on one model and unusable on another for anything byte-exact — IDs, hashes, keys — with no signal that the read went wrong. For an agent grinding through cheap, forgiving context it's a real discount; for the parts that must be exact, keep them in words.
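One way to act on that advice is to screen segments for byte-exact payloads before anything is imaged. The sketch below is a hypothetical heuristic, not something the source says pxpipe does: it treats any run of twelve or more hex digits as a value that must stay in text, and the function names are invented for illustration.

```python
import re

# Hypothetical heuristic: a standalone run of 12+ hex digits is treated
# as a byte-exact value (hash, ID, key) that must not be imaged.
HEX_RUN = re.compile(r"\b[0-9a-fA-F]{12,}\b")


def has_exact_payload(segment: str) -> bool:
    """True if the segment contains something that looks byte-exact."""
    return HEX_RUN.search(segment) is not None


def split_for_imaging(segments: list[str]) -> tuple[list[str], list[str]]:
    """Partition segments into (keep_as_text, safe_to_image)."""
    keep_as_text: list[str] = []
    safe_to_image: list[str] = []
    for segment in segments:
        if has_exact_payload(segment):
            keep_as_text.append(segment)
        else:
            safe_to_image.append(segment)
    return keep_as_text, safe_to_image


if __name__ == "__main__":
    history = [
        "commit 3f9a1c0b7e2d landed on main",
        "the build passed and the tests are green",
    ]
    text_part, image_part = split_for_imaging(history)
    print("keep as text:", text_part)
    print("safe to image:", image_part)
```

A real filter would need more patterns (UUIDs, base64 keys, file paths), and it would err toward keeping content as text, since a misread image gives no signal that anything went wrong.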

The lenses

Novelty 3
Impact · breadth 3
Impact · depth 3
Actionable 5
Substance 4
Hype 2

