"AI is getting smarter on its own": Anthropic's warning

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
©Anthropic

On June 4, 2026, Anthropic, the U.S. company behind the AI chatbot 'Claude,' released a report with a blunt title: "When AI builds itself." The paper, published by the company's internal research arm, the Anthropic Institute, boils down to a single argument: AI is increasingly taking over the work of AI development, and this shift is already accelerating the pace of AI advancement.

The core concept is 'recursive self-improvement.' AI designs the 'next-generation AI,' and the resulting, smarter AI then creates the next one. In other words, it is evolving by improving itself. With each generation, the speed of development increases. Anthropic suggests that, given sufficient computational resources, this trend will eventually lead to a point where AI can design and develop its successors entirely autonomously.

Of course, the company drew a clear line: we have not reached that stage yet, and they noted that the 'recursive self-improvement' they highlighted is not an inevitability. Nevertheless, they issued a warning: this could happen faster than expected, before most institutions are prepared. Through this report, Anthropic is essentially saying, "Let's prepare before we lose our grip on the steering wheel forever."

"AI now writes 80% of our company's code"

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
©Anthropic

What stands out in this report is that Anthropic has disclosed internal figures it previously kept under wraps. The most striking statistic is that as of May 2026, Claude writes over 80% of the code in Anthropic's repositories. In February 2025, when the coding tool 'Claude Code' was first released for research, this figure was in the low single digits. This change occurred in just 16 months. Anthropic explained that while they estimate the figure exceeds 90% if scripts and experimental code are included, 80% is a more conservative estimate.

Productivity metrics have also risen sharply. In the second quarter of 2026, the amount of code produced per engineer was eight times higher than it was four years ago. However, Anthropic acknowledged that lines of code are an imperfect metric that measures quantity, not quality. They added a caveat that the eightfold increase almost certainly inflates actual productivity gains. Even so, they do not deny the trend. They explain that since they do not evaluate employees based on lines of code, this increase signifies that more work is being delegated to Claude.

More noteworthy is the duration of tasks. The time AI spends handling tasks independently is doubling roughly every four months. According to measurements by the external evaluation body METR, in March 2024, 'Opus 3' could independently perform a task that took a human 4 minutes. A year later, 'Sonnet 3.7' handled a task that took 1 hour and 30 minutes, and another year later, 'Opus 4.6' processed a 12-hour task on its own. Anthropic stated, "If this trend continues, we will move from the stage of delegating short errands to handing over entire multi-day projects."

"Humans have the edge in judgment" — but even that is wavering

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
©Anthropic

This does not mean AI is building AI entirely on its own. Creating advanced models is largely divided into two parts: 'execution'—writing code and running training—and 'judgment'—deciding what to attempt and interpreting the results. Anthropic believes that while AI has nearly caught up in execution, humans still lead in judgment—the ability to discern what is worth creating, or what is called 'research intuition.' The problem is that there are signs even this final frontier is wavering.

First, let's look at execution. There is a standard experiment that shows how much faster AI has become in this area. Every time a new model is released, Anthropic gives it a chunk of code and tells it, "Keep the results the same, but make it run faster." The score is based on how many times faster it runs compared to the original code. In May 2025, 'Opus 4' made it about 3 times faster. A year later, in April 2026, 'Mythos Preview' boosted that to about 52 times. When a skilled human solves the same problem, they typically spend four to five hours to achieve a 4x improvement. An AI that was once far behind humans has surged far ahead in just one year.

The real concern is 'judgment,' which was considered the human domain. Here, too, signs of a chase are emerging. In April 2026, Anthropic released the results of giving a Claude-based AI an entire safety research task. The AI handled everything from forming hypotheses and running experiments to reporting results and deciding on the next experiment. While two human researchers spent a week reaching 23% of the goal, several AI units spent a cumulative 800 hours and about $18k worth of compute to solve 97% of the problem. However, the caveat remains that humans still chose the problem and set the scoring criteria, and this performance was not replicated exactly in larger models.

Another experiment touches on a more subtle ability: the intuition to choose "what to do next" when research hits a dead end. Anthropic selected 129 moments where researchers had actually gone down the wrong path, showed the AI only the situation leading up to that point, and asked for the next move. The rate at which the AI found a better method than the human rose from 51% with 'Opus 4.5' in November 2025 to 64% with 'Mythos Preview' six months later.

Anthropic drew a line, noting that since they intentionally chose scenes where humans struggled, it was not a fair head-to-head, but they read it as an early signal that AI is making increasingly better choices at research crossroads. The view on code quality is similar. Within Anthropic, the prevailing expectation is that while code written by Claude was inferior to human work until the end of 2025, it is now on par, and will be better within a year.

The future according to Anthropic, and a proposal to stop together

"AI가 스스로 똑똑해지고 있다" 앤트로픽의 경고
Anthropic CEO Dario Amodei. ©Yonhap News

When AI becomes smart on its own, what kind of future awaits us? Anthropic envisions three possibilities.

First, the path where growth stops. This is a scenario where the current steep upward curve breaks at some point. However, Anthropic considers this possibility low, noting that none of the capabilities measured so far have shown signs of plateauing.

Second, the path where AI development is automated but humans hold the steering wheel. This is the future the report considers most likely: a world where a 100-person company can do the work of an organization of tens of thousands. The problem is that the same power could be used differently. For example, it could be mobilized to monitor an entire population or spread personalized public opinion manipulation.

Third, the path where AI finally designs and creates its own successors. Humans step back from direct development, taking on only supervision and verification. However, if AI heads in an unintended direction while humans are sidelined, it cannot be corrected. Therefore, in this scenario, it is crucial to 'tame' AI so it acts in accordance with human intent and safety goals. Anthropic admits it is not confident it can do this well. In the worst case, AI's misaligned behaviors—which are rare now—could be passed down from successor to successor, growing incrementally until they reach a point beyond human control.

So, what should be done? Anthropic argues that any company building AI must have mechanisms in place to slow down or pause development when necessary. However, if only one company stops, competitors could pull ahead. Therefore, they emphasize that the real key is for the entire world to pause together, in a way that allows everyone to verify that others have truly stopped.

Anthropic noted, however, that creating a collective brake is harder than nuclear arms control. Unlike nuclear development, AI training cannot be detected by satellites like missile silos, and the only 'material' required is compute, which is available everywhere. Moreover, the temptation to break the promise is high, as secretly continuing could grant a monopoly on the lead.

For this reason, Anthropic announced that over the next few months, it will bring policymakers, researchers, civil society, and competing AI companies together to discuss this issue and publish the results. It is a proposal to build a brake together before we lose our grip on the steering wheel entirely.

This article was originally written in Korean and translated with the help of NC AI. It was then edited by a native English-speaking editor. All AI-assisted translations are reviewed and refined by our newsroom. [Read Original]

Sort by:

Comments :0

Insert Image

Add Quotation

Add Translate Suggestion

Language select

Report

CAPTCHA