LM Link
A preview of the LM Studio app pooled four Mac Studios into a 1.5-terabyte machine, ran a trillion-parameter model on it, and answered from an iPhone — no datacenter, no cloud.
At WWDC this month, four Mac Studios sat on a desk and, stitched together, ran Kimi K2.6 — a trillion-parameter model whose weights alone need about a terabyte, more than any single machine holds. A MacBook and then an iPhone queried it live over an encrypted link. The whole thing drew a fraction of the power a comparable rack of datacenter GPUs would.
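The "about a terabyte" figure is easy to sanity-check with back-of-envelope arithmetic. The sketch below is illustrative only: the demo's numeric precision is not stated here, so the bit widths are assumptions, "a trillion parameters" is taken as exactly 10^12, and the 1.5-terabyte pool is read in decimal units. It counts weights only and ignores the KV cache and activations.

```python
# Back-of-envelope weight footprint. All inputs are illustrative assumptions,
# not measurements from the demo.
PARAMS = 1_000_000_000_000   # "trillion-parameter", taken as exactly 1e12
POOLED_BYTES = 1.5e12        # the 1.5 TB pooled memory, in decimal bytes


def weight_bytes(params: int, bits_per_param: int) -> float:
    """Bytes needed to hold the weights alone (no KV cache, no activations)."""
    return params * bits_per_param / 8


def fits(params: int, bits_per_param: int, budget_bytes: float) -> bool:
    """True if the weights alone fit inside the given memory budget."""
    return weight_bytes(params, bits_per_param) <= budget_bytes


if __name__ == "__main__":
    # Hypothetical precisions; the article does not say which one was used.
    for bits in (16, 8, 4):
        terabytes = weight_bytes(PARAMS, bits) / 1e12
        verdict = "fits" if fits(PARAMS, bits, POOLED_BYTES) else "does not fit"
        print(f"{bits}-bit weights: {terabytes:.1f} TB, {verdict} in the pool")
```

Under these assumptions, 8 bits per parameter gives the one-terabyte figure, which fits in the pooled memory with room to spare, while 16-bit weights would need about two terabytes and would not.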
The fast path needs the machines wired into a low-latency mesh; the simpler cabling mode, Apple noted, "doesn't speed up inference" and "isn't supported by all models." — Apple WWDC26 session 233
The clustering isn't LM Studio's trick, and that's worth being clear about. Apple shipped the hard part in macOS this cycle: a way to fan one model's math across several Macs over the Thunderbolt cables between them, fast enough that they behave like one big pooled memory. Apple's own WWDC session demoed the identical model on the identical four machines without ever mentioning LM Studio. Splitting a 671-billion-parameter model across a Mac cluster was already being done by hobbyists last year.
What LM Studio added is the last mile, the part that turns a research capability into something you use. It wrapped Apple's plumbing in the consumer app people already run their local models in, and bolted on remote access: your cluster stays on your desk, but you can talk to it from your phone across town over an end-to-end encrypted link. The reason to want that isn't the trillion-parameter headline. Kimi K2.6 is the first open-weights model to top a hard coding benchmark that had belonged to the closed labs. Running that model privately, on hardware you own, rather than renting it through someone else's API, is the point the demo is making.
It was a preview built with Apple, not a shipped feature, and the four-Mac cluster runs about forty thousand dollars. But the direction is the story: frontier-scale private inference is close enough to a desk-side appliance that the app you'd run it in already exists.