Jagged intelligence

Also known as: jagged capability profile, jagged AI, jagged frontier

The uneven, spiky capability profile of large language models. An LLM can refactor a 100,000-line codebase and miss basic common sense in the same session. The frontier is jagged: exceptionally capable within certain trained domains, surprisingly brittle just outside them.

Andrej Karpathy popularized the term at Sequoia's AI Ascent event in April 2026, but builders had been making the same observation for some time. The image he used: the same model that can find zero-day vulnerabilities in operating systems might give you nonsensical directions to a nearby car wash. The capability frontier is not a smooth curve. It has peaks where reinforcement learning training has concentrated on verifiable, high-reward tasks, and valleys where the model is surprisingly weak.

The mechanistic explanation is RL (reinforcement learning) training circuits: the domains where training rewards have been concentrated. Models improve dramatically in domains where outputs are automatically verifiable: math, code, logic puzzles, game-playing. If your use case falls inside one of those reward circuits, you can get near-superhuman performance. If it falls outside, you can get confident nonsense. This is why benchmark scores can be misleading: a model that scores top-percentile on SWE-bench (a software engineering benchmark) may still fail comically on a simple spatial reasoning task.
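One way to see why a single benchmark number can mislead is to break evaluation results down by task category. The sketch below is illustrative only: the category names, the sample outcomes, and the helper names per_category_pass_rates and jaggedness are assumptions made for this example, not measurements of any real model or part of any benchmark's tooling.

```python
from collections import defaultdict


def per_category_pass_rates(results):
    """Return {category: pass_rate} from an iterable of (category, passed) pairs."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for category, passed in results:
        totals[category] += 1
        if passed:
            passes[category] += 1
    return {category: passes[category] / totals[category] for category in totals}


def jaggedness(rates):
    """Gap between the strongest and weakest category (0.0 means a flat profile)."""
    return max(rates.values()) - min(rates.values())


# Made-up outcomes for illustration: strong on code and math, weak on spatial tasks.
sample_results = (
    [("code", True)] * 9 + [("code", False)] * 1
    + [("math", True)] * 8 + [("math", False)] * 2
    + [("spatial", True)] * 2 + [("spatial", False)] * 8
)

rates = per_category_pass_rates(sample_results)
overall = sum(passed for _, passed in sample_results) / len(sample_results)

print("overall pass rate:", round(overall, 2))
for category, rate in sorted(rates.items()):
    print(f"{category}: {rate:.2f}")
print("peak-to-valley gap:", round(jaggedness(rates), 2))
```

The overall figure sits between the peaks and the valley and describes neither. Reporting the per-category rates alongside it makes the jagged shape visible.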

For builders, jagged intelligence has two practical implications. First, it's a debugging lens: when a model fails on your use case, the question isn't always 'is this model bad?' but 'is this task outside the verifiable circuits that training emphasized?' That diagnosis changes the fix, whether that's better prompting, retrieval, fine-tuning, or a human review step. Second, it's a product design principle: stack agents on the peaks of the jagged frontier, not in the valleys. The most reliable AI products route tasks to the domains where the model's capability is actually high.
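The routing principle above can be sketched as a small dispatcher that sends a task to the model only when your own evaluations show the model is strong in that category. This is a minimal sketch under stated assumptions: the Route type, the route_task function, the handler names, and the 0.8 threshold are all hypothetical and would need to come from your own evals and risk tolerance.

```python
from dataclasses import dataclass

# Assumed cutoff for "on a peak"; choose a value that fits your own risk tolerance.
MIN_PASS_RATE = 0.8


@dataclass
class Route:
    handler: str  # hypothetical handler names: "model", "model_plus_review", "human_review"
    reason: str


def route_task(category, pass_rates, threshold=MIN_PASS_RATE):
    """Pick a handler for a task based on measured per-category pass rates."""
    rate = pass_rates.get(category)
    if rate is None:
        # No evidence either way: treat the category as a potential valley.
        return Route("human_review", f"no eval data for '{category}'")
    if rate >= threshold:
        return Route("model", f"pass rate {rate:.2f} meets threshold {threshold:.2f}")
    return Route("model_plus_review", f"pass rate {rate:.2f} is below threshold {threshold:.2f}")


# Illustrative pass rates, not real measurements.
measured = {"code": 0.9, "math": 0.8, "spatial": 0.2}

for task_category in ("code", "spatial", "legal"):
    route = route_task(task_category, measured)
    print(f"{task_category}: {route.handler} ({route.reason})")
```

Treating unmeasured categories as valleys by default is a deliberately conservative design choice: a task that has never been evaluated gets human review until there is data showing it sits on a peak.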

This definition is AI-generated and refreshed weekly. It may contain inaccuracies. Use your own judgment, especially for production decisions.
