Concept. Also: neural scaling laws, scaling law

Scaling laws

The empirical finding that a model's loss falls predictably as you add compute, data, and parameters — turning "make it bigger" from a hunch into a forecastable curve.

In a nutshell

Scaling laws are the observation that, across many orders of magnitude of scale, a language model's loss falls smoothly and predictably, roughly as a power law, with more compute, more data, and more parameters. Because the curve is regular, you can fit it on small training runs and forecast how good a bigger model will be before training it, which is why labs pour capital into scale. The open question is where, or whether, the curve finally bends.

Scaling laws reframed AI progress as an engineering forecast rather than a series of lucky architectures. If loss is a predictable function of compute, the rational move is to buy more compute — which is much of what the frontier labs have done.
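To make "forecastable" concrete, here is a minimal sketch of the workflow: fit a power law to a few small runs, then extrapolate to a larger compute budget. Everything in it is an illustrative assumption rather than anything from the original paper: the function names `fit_power_law` and `predict_loss` are hypothetical, the data is synthetic and noise-free, and the model is a pure power law with no irreducible-loss floor.

```python
import math


def fit_power_law(compute, loss):
    """Least-squares fit of loss = a * compute**(-b), done in log-log space.

    Taking logs turns the power law into a straight line:
        log(loss) = log(a) - b * log(compute)
    so an ordinary linear regression recovers a and b.
    """
    xs = [math.log(c) for c in compute]
    ys = [math.log(l) for l in loss]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope  # (a, b)


def predict_loss(a, b, compute):
    """Evaluate the fitted power law at a given compute budget."""
    return a * compute ** (-b)


# Synthetic "small runs" (assumed values, chosen only for illustration).
TRUE_A, TRUE_B = 10.0, 0.05
small_runs = [1e15, 1e16, 1e17, 1e18]  # compute budgets, arbitrary units
observed = [TRUE_A * c ** (-TRUE_B) for c in small_runs]

a, b = fit_power_law(small_runs, observed)

# Forecast a run 1000x larger than the biggest one we "trained".
forecast = predict_loss(a, b, 1e21)
print(f"fitted a={a:.4f}, b={b:.4f}, forecast loss at 1e21: {forecast:.4f}")
```

Real measurements are noisy, and a fit like this is only trustworthy as far as the power law actually holds, which is exactly the "does the curve bend" question.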

The debate they fuel: optimists read the unbroken curve as a runway to far more capable systems; sceptics argue the benchmarks that keep improving aren't the ones that matter, and that something other than scale is missing. Both camps are watching the same graph.

Where it came from

Year: 2020
Source: Kaplan et al., "Scaling Laws for Neural Language Models" (OpenAI)
Why it mattered: Quantified loss as a power law in compute, data, and parameters, which became the empirical basis for "just add scale".
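
For reference, the power laws in that paper take roughly the following form when the other two factors are not the bottleneck. The symbols follow the paper's notation and are not defined elsewhere on this page: N is the non-embedding parameter count, D the dataset size in tokens, C_min the compute budget when allocated optimally, and the constants and exponents are fitted empirically.

```latex
L(N) = \left(\frac{N_c}{N}\right)^{\alpha_N}, \qquad
L(D) = \left(\frac{D_c}{D}\right)^{\alpha_D}, \qquad
L(C_{\min}) = \left(\frac{C_c^{\min}}{C_{\min}}\right)^{\alpha_C^{\min}}
```

On a log-log plot each of these is a straight line whose slope is the (negative) exponent, which is what makes the curve easy to extrapolate.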

Related concepts