Recursive Self-Improvement: What Happens When AI Can Make Itself Smarter

What this threat is

Recursive self-improvement refers to an AI system's ability to modify, retrain, or redesign itself in ways that increase its own capabilities, which in turn makes it better at further improving itself. The concept isn't purely theoretical. Narrow forms of it are already happening. AI systems are being used to automate parts of AI research, to generate and test new neural network architectures, to write training code, and to optimize the training processes that produce the next generation of AI models. The question isn't whether AI participates in its own development; it does. The question is what happens if and when that participation becomes general enough that the AI is meaningfully driving the pace of its own capability growth rather than just assisting human researchers who are driving it.

The distinction between narrow and general self-improvement is important. A system that's optimized to improve performance on a specific task, say, chip design or protein folding, is doing something genuinely different from a system that can improve its general reasoning ability, its ability to learn from experience, or its capacity to pursue a wide range of goals. Current AI systems are much closer to the narrow end. But the current trajectory of AI development is narrowing the gap between "AI assists with some parts of AI research" and "AI is able to improve its general capabilities," even if it isn't clear how quickly.

What makes the compounding dynamic potentially alarming is how quickly small capability differences can translate into large outcome differences. If an AI system that's slightly more capable than its predecessor is better at improving itself, even a modest edge in the first iteration can produce a significantly more capable system in the second iteration, which produces an even larger edge in the third, and so on. This is the "intelligence explosion" scenario that researchers have discussed for decades: a process of self-amplifying capability growth that moves faster than human institutions can track or respond to. The pace of that process is deeply uncertain, but the logical structure of the compounding dynamic is straightforward.
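The compounding argument can be made concrete with a toy model. The sketch below is an illustration under stated assumptions, not a forecast: the function name simulate, the parameters base_rate and feedback, and every numeric value are hypothetical, and "capability" is treated as a single number purely for simplicity. It compares growth at a fixed rate with growth where the rate of improvement rises with the system's current capability.

```python
def simulate(initial: float, base_rate: float, feedback: float, steps: int) -> list[float]:
    """Return capability levels over `steps` improvement cycles.

    Each cycle multiplies capability by (1 + base_rate + feedback * capability).
    With feedback == 0 this is ordinary exponential growth; with feedback > 0
    the growth rate itself rises as capability rises (an assumed toy dynamic).
    """
    levels = [initial]
    for _ in range(steps):
        current = levels[-1]
        rate = base_rate + feedback * current  # more capable -> improves faster
        levels.append(current * (1 + rate))
    return levels


# Hypothetical parameters chosen only to show the shape of the two curves.
fixed = simulate(initial=1.0, base_rate=0.10, feedback=0.0, steps=10)
compounding = simulate(initial=1.0, base_rate=0.10, feedback=0.02, steps=10)

for cycle, (a, b) in enumerate(zip(fixed, compounding)):
    print(f"cycle {cycle:2d}  fixed-rate {a:6.2f}  self-amplifying {b:6.2f}")
```

In this toy model the two curves start close together and then diverge, because the per-cycle growth ratio of the self-amplifying system keeps increasing. How strong any such feedback would be in real systems is exactly the open question the paragraph above describes.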

The "discontinuity problem" is closely related. Current AI progress has been relatively gradual in the sense that capability improvements happen over months and years, giving researchers, policymakers, and society time to observe and respond. A recursive self-improvement dynamic could, in principle, produce capability jumps that happen much faster than that, faster than evaluation processes can characterize what a system can do, faster than safety research can develop responses to new capabilities, and faster than governance frameworks can be put in place. Whether such discontinuities are likely is genuinely debated among researchers. But the possibility is real enough to take seriously.

Why it matters

Recursive self-improvement matters primarily because of what it does to oversight. The entire model of AI governance assumes that humans can evaluate AI systems, understand what they're capable of, and impose constraints based on that understanding. If a system improves itself faster than humans can characterize it, the evaluation step becomes unreliable. You might deploy a system based on an assessment of its capabilities that's already outdated by the time deployment happens. You might impose constraints that were appropriate for the system you evaluated but are irrelevant to the system that exists after several cycles of self-improvement. The governance model breaks down not because the AI is hostile but because it's moving faster than the oversight infrastructure can follow.
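The evaluation-lag problem can also be sketched numerically. The following is a minimal illustration, assuming that capability can be summarized as one number per improvement cycle and that an evaluation takes a fixed number of cycles to complete. The function assessment_gap, the eval_lag parameter, and the growth figures are hypothetical.

```python
def assessment_gap(capabilities: list[float], eval_lag: int) -> list[float]:
    """For each cycle, return actual capability minus the most recent
    capability level that has been fully evaluated, assuming every
    evaluation takes `eval_lag` cycles to finish."""
    gaps = []
    for cycle, actual in enumerate(capabilities):
        # The newest completed assessment describes the system as it was
        # eval_lag cycles ago (or the initial system, early on).
        assessed = capabilities[max(0, cycle - eval_lag)]
        gaps.append(actual - assessed)
    return gaps


# Hypothetical trajectories: 5% versus 50% capability growth per cycle.
slow = [1.05 ** n for n in range(8)]
fast = [1.5 ** n for n in range(8)]

slow_gaps = assessment_gap(slow, eval_lag=2)
fast_gaps = assessment_gap(fast, eval_lag=2)

for cycle, (s, f) in enumerate(zip(slow_gaps, fast_gaps)):
    print(f"cycle {cycle}  gap (slow growth) {s:5.2f}  gap (fast growth) {f:5.2f}")
```

With the same evaluation lag, the gap between what was assessed and what exists stays small under slow growth and widens every cycle under fast growth. Nothing about the system's intent enters the calculation; the gap comes from speed alone.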

This is closely connected to the "hard takeoff vs. soft takeoff" debate in AI safety research. A soft takeoff would mean that AI capability improvements happen gradually enough that humans can observe them, understand them, and adjust policies and constraints accordingly. A hard takeoff would mean that the improvements happen fast enough that the window for human adjustment is too short to be meaningful. Many researchers expect that if recursive self-improvement becomes possible at all, the speed of the process will fall somewhere between these extremes, but there's genuine disagreement about where in that range it would fall and how much warning time would exist.

What makes recursive self-improvement distinct from other AI risks is that it's a capability amplifier for all of them. A misaligned AI system that can improve its own capabilities becomes more capable of pursuing misaligned goals. An AI used for offensive cyber operations that can improve itself becomes a more capable offensive cyber tool. The risks discussed in every other entry in this series become harder to manage if the AI systems involved can improve themselves. Recursive self-improvement isn't just a threat in its own right; it's a factor that compounds the severity of every other threat.

Where things stand today

Current AI systems participate in their own development in limited but real ways. Large AI companies use AI models to help write and debug training code. Neural architecture search systems use automated optimization to discover network architectures that can match or outperform human-designed ones on specific tasks. AI is used to generate synthetic training data that improves model performance. These are significant instances of AI participating in AI development, but they're quite different from an AI system that can improve its general reasoning capabilities without human direction. The gap between what exists now and the general recursive self-improvement scenario is still large.

The safety research community is working on several approaches to address the risks associated with rapid capability growth. Interpretability research aims to build tools that let humans understand what's happening inside AI systems, which is a prerequisite for evaluating whether a system that has improved itself is still behaving as intended. Capability evaluation frameworks try to systematically test AI systems across a range of tasks to detect unexpected capability jumps. Oversight techniques, including methods for training AI systems to be easier for humans to monitor and check, are a growing area of research. None of these are solved problems, but there's active and serious work on all of them.
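As one illustration of what detecting an unexpected capability jump might look like in practice, the sketch below compares evaluation scores across successive model checkpoints and flags any task whose score rises by more than a chosen threshold. This is a simplified assumption about how such a check could work, not a description of any existing evaluation framework: the EvalResult type, the flag_jumps function, the task names, the scores, and the 0.15 threshold are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    checkpoint: str
    task: str
    score: float  # hypothetical normalized score in [0, 1]


def flag_jumps(results: list[EvalResult], threshold: float = 0.15) -> list[tuple[str, str, float]]:
    """Return (task, checkpoint, delta) for every task whose score rose by
    more than `threshold` since the previous checkpoint.

    Assumes `results` is already ordered from oldest to newest checkpoint.
    """
    last_score: dict[str, float] = {}
    flagged = []
    for result in results:
        previous = last_score.get(result.task)
        if previous is not None and result.score - previous > threshold:
            flagged.append((result.task, result.checkpoint, round(result.score - previous, 3)))
        last_score[result.task] = result.score
    return flagged


# Made-up scores for two illustrative tasks across two checkpoints.
results = [
    EvalResult("ckpt-1", "code_repair", 0.40),
    EvalResult("ckpt-1", "long_horizon_planning", 0.20),
    EvalResult("ckpt-2", "code_repair", 0.45),
    EvalResult("ckpt-2", "long_horizon_planning", 0.55),
]

flags = flag_jumps(results)
for task, checkpoint, delta in flags:
    print(f"{task}: +{delta} at {checkpoint}, needs review before further deployment")
```

A check like this only catches jumps on tasks someone thought to measure, which is part of why evaluation remains an open problem rather than a solved one.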

The EU AI Act addresses frontier AI systems through its general-purpose AI (GPAI) provisions, which require evaluation of GPAI models for systemic risks, including risks to critical infrastructure and human oversight. Providers of GPAI models classified as posing systemic risk face transparency obligations, adversarial testing requirements, and incident reporting duties. These provisions aren't specifically targeted at recursive self-improvement, but they create governance infrastructure that becomes more important as AI systems become more capable of affecting their own development. The challenge is that the regulatory frameworks being built now will need to scale to govern systems significantly more capable than those they were designed around.

How Better Societies helps

Compliance: The EU AI Act's GPAI provisions and frontier model evaluation requirements create specific obligations for organizations developing or deploying the most capable AI systems. Our compliance programs help those organizations understand their obligations under the Act's systemic risk framework, conduct the required evaluations, implement the transparency and incident reporting processes the regulation mandates, and document their safety practices in ways that will satisfy regulators.

Summit: The governance of frontier AI capabilities requires coordination across AI developers, governments, safety researchers, and civil society organizations that currently have limited channels for working together. The Better Societies Summit creates the cross-sector conversations needed to build the shared frameworks, evaluation standards, and capability governance norms that no single organization can establish on its own.

Accelerator: Evaluation tools that can characterize AI capabilities reliably, interpretability systems that make AI behavior transparent to human oversight, and oversight mechanisms that remain effective as AI systems become more capable are all open research and engineering challenges that founders are working on. If you're building in this space, the Better Societies Accelerator supports the technical and policy work needed to ensure that human oversight remains meaningful as AI capabilities grow.

Related threats

Loss of Control

Autonomous Weapons

AI Cyber Warfare
