Emergent Abilities in LLMs

As large language models (LLMs) grow, most of their skills improve smoothly. But some abilities behave differently: they're absent in smaller models and then appear suddenly once a model crosses a certain size. These are emergent abilities: capabilities that seem to "switch on" at scale, often without anyone explicitly designing them. They're one of the most fascinating, and most debated, aspects of LLM behaviour.

💡 In one line: Emergent abilities are skills that appear abruptly once a model gets big enough — not present in smaller models, and hard to predict.

What Are Emergent Abilities?

An emergent ability is a capability not present in smaller models that appears abruptly once a model reaches a certain scale (in parameters, data, or compute). On such a task, performance stays near random as the model grows until it reaches a threshold, after which it rises well above random.

This is different from the smooth improvements described by scaling laws. Here the curve is sharp, not gradual.
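To make the contrast concrete, here is a toy sketch that prints a smooth loss curve next to a sharp task-accuracy curve. The power-law form for loss, the logistic-in-log-size form for accuracy, and every constant (`ALPHA`, `N_C`, `THRESHOLD_PARAMS`, `STEEPNESS`, `CHANCE`) are illustrative assumptions chosen for the picture, not fitted or measured values.

```python
import math

# All constants below are illustrative assumptions, not measurements.
ALPHA = 0.08             # hypothetical power-law exponent
N_C = 1e14               # hypothetical scale constant
THRESHOLD_PARAMS = 1e10  # hypothetical size where the task "switches on"
STEEPNESS = 6.0          # hypothetical sharpness of the jump (per decade of size)
CHANCE = 0.25            # chance accuracy on a 4-way multiple-choice task


def toy_loss(n_params: float) -> float:
    """Smooth power-law loss: falls steadily as the model grows."""
    return (N_C / n_params) ** ALPHA


def toy_task_accuracy(n_params: float) -> float:
    """Near chance below the threshold, then a sharp rise (logistic in log10 size)."""
    x = STEEPNESS * (math.log10(n_params) - math.log10(THRESHOLD_PARAMS))
    return CHANCE + (1.0 - CHANCE) / (1.0 + math.exp(-x))


if __name__ == "__main__":
    for exponent in range(7, 13):
        n = 10.0 ** exponent
        print(
            f"1e{exponent} params  "
            f"loss={toy_loss(n):.3f}  accuracy={toy_task_accuracy(n):.3f}"
        )
```

In this toy setup the loss column shrinks a little with every tenfold increase in size, while the accuracy column sits at chance for several rows and then climbs within about one order of magnitude.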

Examples

Abilities often reported as emergent include:

  • Multi-step reasoning (elicited by chain-of-thought prompting).
  • Multi-digit arithmetic.
  • In-context / few-shot learning.
  • Following complex instructions.
  • Translating low-resource languages.
  • Code generation.

In small models, performance on many of these stays near chance; it only becomes reliable past a certain scale.

Why They're Surprising

Scaling laws predict a smooth decrease in loss. Emergent abilities seem to break that pattern with sharp, qualitative jumps on specific tasks. The unsettling part: you often can't predict which abilities will appear, or exactly when. A bigger model can do things its creators didn't explicitly plan for.

The Debate: Real or a Mirage?

There's an important nuance here, and researchers genuinely disagree:

  • One view: emergence is real — certain capabilities genuinely appear only at scale, marking true qualitative shifts.
  • Another view: some apparent emergence is a measurement artifact. With all-or-nothing metrics (like exact-match accuracy), a gradual underlying improvement can look like a sudden jump. Measured with smoother metrics (such as per-token accuracy or edit distance), the same ability often improves continuously.

The takeaway: "emergence" sometimes reflects how we measure as much as what the model does. Both perspectives are worth keeping in mind.
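The measurement-artifact argument can be illustrated with a small sketch. It assumes, purely for simplicity, that a model gets each token of an answer right independently with probability p, so the exact-match rate on a k-token answer is p^k. The function name `exact_match_rate` and the sample accuracies are hypothetical, chosen only for illustration.

```python
def exact_match_rate(per_token_accuracy: float, answer_length: int) -> float:
    """Probability that every token is correct, assuming independent tokens."""
    return per_token_accuracy ** answer_length


# Hypothetical per-token accuracies for a series of increasingly large models.
PER_TOKEN_ACCURACIES = [0.50, 0.60, 0.70, 0.80, 0.90, 0.95, 0.99]
ANSWER_LENGTH = 10  # tokens in the target answer (an assumption)

if __name__ == "__main__":
    print("per-token  exact-match")
    for p in PER_TOKEN_ACCURACIES:
        print(f"{p:9.2f}  {exact_match_rate(p, ANSWER_LENGTH):11.3f}")
```

With a 10-token answer, raising per-token accuracy from 0.50 to 0.80 moves exact match from about 0.001 to about 0.11, and only near 0.99 does it reach roughly 0.90. The underlying skill improves steadily, but the all-or-nothing score looks like a late jump.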

Relation to Scaling Laws

  • Scaling laws describe the smooth, predictable drop in overall loss.
  • Emergent abilities concern specific downstream tasks, where measured performance can jump sharply even while overall loss keeps falling smoothly.

They're two lenses on the same phenomenon of scaling.

Why It Matters

  • Unpredictability — it's hard to know in advance what a bigger model will be able to do.
  • Capability and safety — new abilities (helpful or risky) can appear unexpectedly, which is a major reason models are evaluated carefully before deployment.
  • Motivation to scale — the promise of new abilities helped drive the race to larger models.

Summary

  • Emergent abilities are absent in smaller models and appear abruptly at a certain model scale.
  • Examples include reasoning, arithmetic, in-context learning, and instruction following.
  • They contrast with the smooth improvements of scaling laws — though some apparent emergence may be a measurement artifact.
  • They make a bigger model's capabilities hard to predict.
  • This unpredictability matters for both capability and safety, motivating careful evaluation.