"AI has begun to make itself smarter": Anthropic's warning

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
©Anthropic

On June 4, 2026, Anthropic, the U.S. company behind the AI chatbot 'Claude,' released a report with a blunt title: "When AI builds itself." The report was published by the Anthropic Institute, the company's internal research arm, and its core argument can be summarized in one line: AI is increasingly taking over the work of AI development, and this shift is already accelerating the pace of AI advancement.

The central concept is 'recursive self-improvement.' An AI designs its own 'next-generation AI,' and the resulting, smarter AI then creates the one after that. In other words, AI evolves by improving itself, and the pace of development increases with each generation. Anthropic suggests that, given sufficient computational resources, this trend will eventually reach a point where AI can design and develop its own successors with full autonomy.

Of course, the company made clear that it has not yet reached that stage and that the 'recursive self-improvement' it describes is not guaranteed to happen. Nevertheless, it issued a warning: that stage could arrive sooner than expected, before most institutions are prepared. Through this report, Anthropic is essentially saying, "Let's prepare now, before we lose our grip on the steering wheel for good."

"AI now writes 80% of our company's code"

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
©Anthropic

What stands out in this report is that Anthropic has disclosed internal data it had previously kept private. Most strikingly, as of May 2026, Claude writes over 80% of the code that goes into Anthropic's code repositories. In February 2025, when the coding tool 'Claude Code' was first released as a research preview, this figure was in the low single digits. The change took roughly 16 months. Anthropic estimates that the figure exceeds 90% if scripts and experimental code are included, and it presents 80% as the more conservative number.

Productivity metrics have also risen sharply. In the second quarter of 2026, the amount of code contributed per engineer was eight times higher than four years earlier. Anthropic acknowledged, however, that lines of code are an imperfect metric that measures quantity, not quality, and cautioned that the 'eightfold' figure almost certainly overstates the actual productivity gain. Even so, the company does not dispute the trend itself. Because it does not evaluate employees on lines of code, it reads an increase of this size as a sign that more work is being delegated to Claude.

More noteworthy is the length of the tasks involved. The length of task that AI can handle independently, measured by how long the same task takes a human, is doubling roughly every four months. According to measurements by the external evaluation organization METR, in March 2024, 'Opus 3' could independently perform a task that took a human 4 minutes. A year later, 'Sonnet 3.7' handled a task that took 1 hour and 30 minutes, and another year later, 'Opus 4.6' completed a 12-hour task on its own. Anthropic stated, "If this trend continues, we will move from the stage of delegating short errands to handing over entire projects that take several days."
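
As a rough check on the four-month figure, the three METR data points quoted above can be turned into implied doubling times. The sketch below assumes exponential growth between points and places the measurements exactly 12 months apart, since the article says only "a year later." The function name is illustrative and does not come from the report.

```python
import math

# Task lengths (in human minutes) as quoted in the article.
# ASSUMPTION: measurements sit exactly 12 months apart ("a year later").
POINTS = [
    ("Opus 3", 0, 4),            # March 2024: 4-minute task
    ("Sonnet 3.7", 12, 90),      # a year later: 1 hour 30 minutes
    ("Opus 4.6", 24, 12 * 60),   # another year later: 12 hours
]


def doubling_time_months(month_a, minutes_a, month_b, minutes_b):
    """Months per doubling, assuming exponential growth between two points."""
    doublings = math.log(minutes_b / minutes_a) / math.log(2)
    return (month_b - month_a) / doublings


first_year = doubling_time_months(0, 4, 12, 90)
second_year = doubling_time_months(12, 90, 24, 720)
overall = doubling_time_months(0, 4, 24, 720)

print(f"first year:  {first_year:.1f} months per doubling")
print(f"second year: {second_year:.1f} months per doubling")
print(f"overall:     {overall:.1f} months per doubling")
```

Under these assumptions the implied doubling time comes out at about 2.7 months for the first year, 4.0 months for the second, and 3.2 months across the whole two-year span, so the "roughly every four months" figure matches the most recent year most closely.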

"Humans have the edge in judgment" — but even that is wavering

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
©Anthropic

This does not mean AI is building AI entirely on its own. Creating advanced models broadly involves two kinds of work: 'execution' (writing code and running training) and 'judgment' (deciding what to attempt and interpreting the results). Anthropic believes that while AI has almost caught up in execution, humans still lead in judgment, the so-called 'research intuition' required to discern what is worth creating. The problem is that there are signs even this last human advantage is eroding.

First, consider execution. Anthropic runs a standard experiment that shows how much faster AI has become here. Every time a new model is released, the company gives it a piece of code and says, "Keep the results the same, but make it run faster." The score is how many times faster the optimized code runs than the original. In May 2025, 'Opus 4' made it about 3 times faster. About a year later, in April 2026, 'Mythos Preview' made it about 52 times faster. A skilled human working on the same problem typically spends four to five hours to achieve a 4x improvement. An AI that trailed a skilled human a year ago has now surged far ahead.
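
To put the speedup scores side by side, the short calculation below uses only the three figures quoted above. The variable and function names are illustrative assumptions. Converting a speedup into "fraction of the original runtime remaining" is simply another way of reading the same score, not a metric the report is known to use.

```python
# Speedup factors as quoted in the article (multiples of the original speed).
OPUS_4 = 3.0           # May 2025: "about 3 times faster"
MYTHOS_PREVIEW = 52.0  # April 2026: "about 52 times"
SKILLED_HUMAN = 4.0    # typical result after four to five hours of work


def remaining_runtime_fraction(speedup):
    """Fraction of the original runtime left after a given speedup."""
    return 1.0 / speedup


model_gain = MYTHOS_PREVIEW / OPUS_4            # newer model vs. older model
lead_over_human = MYTHOS_PREVIEW / SKILLED_HUMAN

print(f"Mythos Preview vs. Opus 4: {model_gain:.1f}x")
print(f"Mythos Preview vs. skilled human: {lead_over_human:.1f}x")
for name, speedup in [("Opus 4", OPUS_4),
                      ("skilled human", SKILLED_HUMAN),
                      ("Mythos Preview", MYTHOS_PREVIEW)]:
    share = remaining_runtime_fraction(speedup)
    print(f"{name}: runs in {share:.1%} of the original time")
```

By this arithmetic, 'Mythos Preview' is about 17 times faster than 'Opus 4' and 13 times faster than the typical skilled-human result, and a 52x speedup means the code runs in roughly 2% of the original time.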

The real concern is 'judgment,' the domain once thought to be uniquely human. Here, too, there are signs that AI is catching up. In April 2026, Anthropic released the results of having a Claude-based AI handle an entire safety research task. The AI performed everything from forming hypotheses and running experiments to reporting results and deciding on the next experiment. On a problem where two human researchers reached 23% of the goal in a week, several AIs reached 97%, using a cumulative 800 hours and about $18,000 worth of compute. The caveat is that humans still chose the problems and set the grading criteria, and the results were not replicated exactly on actual large-scale models.

Another experiment touches on a more subtle ability: the intuition to choose "what to do next" when research hits a wall. Anthropic selected 129 moments where researchers had actually gone down the wrong path, showed the AI only the situation immediately preceding those moments, and asked for the next move. The rate at which the AI found a better method than the human rose from 51% with 'Opus 4.5' in November 2025 to 64% with 'Mythos Preview' six months later.
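
For a sense of scale, the percentages above can be converted back into approximate case counts out of 129. The sketch below also attaches a rough sampling uncertainty to each rate. It assumes the 129 moments can be treated as independent trials and uses a normal-approximation standard error. Neither assumption comes from the report, and the reported percentages are rounded, so the counts are approximate.

```python
import math

N_MOMENTS = 129  # decision points selected by Anthropic, per the article

REPORTED_RATES = [
    ("Opus 4.5", 0.51),        # November 2025
    ("Mythos Preview", 0.64),  # about six months later
]


def implied_wins(rate, n):
    """Approximate number of cases behind a rounded percentage."""
    return round(rate * n)


def standard_error(rate, n):
    """Normal-approximation standard error of a proportion.

    ASSUMPTION: the n cases are independent trials.
    """
    return math.sqrt(rate * (1 - rate) / n)


for name, rate in REPORTED_RATES:
    wins = implied_wins(rate, N_MOMENTS)
    se = standard_error(rate, N_MOMENTS)
    print(f"{name}: about {wins} of {N_MOMENTS} cases, "
          f"standard error about {se * 100:.1f} percentage points")
```

With 129 moments, the reported rates correspond to roughly 66 and 83 cases, and each rate carries a sampling uncertainty of around four percentage points under the independence assumption.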

Anthropic cautioned that this was not a fair head-to-head comparison, since it deliberately chose moments where humans had struggled, but it interpreted the result as an early signal that AI is making increasingly better choices at research crossroads. The view on code quality is similar. Within Anthropic, the prevailing view is that code written by Claude was inferior to human work until the end of 2025, is now on par, and will be better within a year.

The future according to Anthropic, and a proposal to stop together

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
Anthropic CEO Dario Amodei. ©Yonhap News

When AI becomes smarter on its own, what kind of future awaits us? Anthropic envisions three possibilities.

First is the path where growth stops. This is a scenario where the current steep upward curve breaks at some point. However, Anthropic considers this possibility low, noting that none of the capabilities measured so far have shown signs of plateauing.

Second is a path where AI development is automated, but humans hold the steering wheel. This is the future the report considers most likely: a world where a 100-person company can do the work of an organization of tens of thousands. The problem is that the same power can be put to other uses. For example, it could be mobilized to monitor an entire population or to manipulate public opinion with messages tailored to each individual.

Third is the path where AI ultimately designs and creates its own successors. Humans step away from direct development and handle only supervision and verification. However, if humans step back and the AI heads in an unintended direction, there would be no way to correct it. In this scenario, it is therefore crucial to 'tame' the AI so that it acts in accordance with human intent and safety goals. Anthropic admits it is not confident it can do this well. In the worst case, AI's rare misaligned behaviors could be passed down from successor to successor, growing incrementally until they are beyond human control.

So, what should be done? Anthropic argues that any company building AI must have mechanisms in place to slow down or pause development when necessary. However, if only one company stops, competitors could pull ahead. Therefore, they emphasize that the real key is for the entire world to pause together, in a way that allows everyone to verify that others have truly stopped.

Anthropic notes, however, that building a collective 'brake' is harder than controlling nuclear weapons. Unlike missile silos, AI training cannot be spotted by satellite, and the only material it requires is compute, which is available everywhere. Furthermore, the temptation to break the agreement is strong, since a party that secretly continued could seize an uncontested lead.

For this reason, Anthropic announced that it will invite policymakers, researchers, civil society, and competing AI companies to discuss this issue over the coming months and publish the results. It is a proposal to prepare a collective brake before we lose our grip on the steering wheel entirely.

This article was originally written in Korean and translated with the help of NC AI. It was then edited by a native English-speaking editor. All AI-assisted translations are reviewed and refined by our newsroom.
