
Just a few years ago, collaborating with AI was a solitary task, involving little more than exchanging messages in a chat tab. It was a 'one person, one tool' structure where the user asked a question, the AI answered, and the user followed up.
However, Anthropic has declared that this approach is now a relic of the past. In a post titled 'Building Effective Human-Agent Teams' published on its official blog on the 24th, the company noted that as AI increasingly handles complex and sophisticated tasks—such as coding, research, and financial analysis—the collaboration paradigm is shifting. The way we work is evolving from a 'single-player' to a 'multi-player' era. The days of one human unilaterally commanding one agent are over. The future lies in building a 'complete team' where humans set the strategy and multiple AI agents divide the execution.
This report summarizes four lessons Anthropic learned from months of internal experimentation. Interestingly, these lessons are by no means new concepts. Clear goal setting, defined roles, thorough documentation, and shared quality standards are exactly the same 'good team habits' that human organizations have adhered to for decades. The difference is that with advanced AI agents joining the team, strictly following these basic rules has become more critical than ever.
From a 'Tool' You Call to a 'Colleague' in Residence

What does it actually look like to 'work as a team with AI'? Anthropic defines it as a 'multi-player agent'—an AI that collaborates simultaneously with an entire organization rather than just an individual.
A multi-player agent is clearly distinct from the chatbots we commonly use. Technically the two might seem similar, since both possess their own memory and skills. The decisive difference lies in 'workspace' and 'credentials.' A multi-player agent is granted its own credentials, independent of any specific individual. Above all, it resides in the space where actual work takes place. In Anthropic's case, that space is a collaboration tool like Slack. It is not a one-off tool that only turns on when a user calls it, but something closer to a colleague that lives in a team channel.
For an AI agent to pull its weight as a member of an organization, three prerequisites—much like a human's 'hiring conditions'—are essential.
The first is 'persistent memory' to remember team goals and coordinate execution. The second is 'credentials separate from humans,' which ensures the agent operates only within safe and predictable boundaries. Because the agent does not borrow a human account, its scope of work is clearly defined. The third is 'broad information access,' allowing it to learn how the organization operates and process tasks according to its goals.
Only when these three pieces of infrastructure are in place can an agent join the team. But joining is one thing and performing is another; these are merely the technical foundations. For a human-agent team to actually deliver results, the humans must change first: the team needs new ways of collaborating and clear norms that every member follows. Anthropic derived four lessons from its internal experiments.
Lesson #1: Work in the Open

For an AI agent, there is no such thing as a 'private conversation' or 'tacit agreement.' Agents understand context based only on text that the organization has left searchable—Slack messages, source code, shared documents, and meeting minutes. Personal direct messages (DMs), water-cooler chat, or documents with restricted access provide no context to an agent. If it isn't recorded and accessible, it doesn't exist to the agent.
For this reason, Anthropic does not set access permissions document by document or channel by channel. Instead, it sets a few broad, clear 'security boundaries' at the level of the entire Slack workspace or document library. Within these designated boundaries, all context flows to both humans and AI. Constantly agonizing over whether a specific channel should be public or if a document can be shared causes decision fatigue for both humans and AI. Simplifying these boundaries eliminates such inefficiencies.
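As a rough illustration of the boundary-level approach described above, the sketch below decides access from one coarse boundary per resource instead of a permission list per document or channel. The names `Boundary`, `Resource`, and `can_read`, as well as the sample workspaces and members, are hypothetical and do not come from Anthropic's post.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Boundary:
    """A coarse security boundary, such as one workspace or one document library."""
    name: str
    members: frozenset  # humans and agents are listed side by side


@dataclass(frozen=True)
class Resource:
    """A channel or document; it belongs to exactly one boundary."""
    path: str
    boundary: str


def can_read(principal: str, resource: Resource, boundaries: dict) -> bool:
    """Decide access at the boundary level only; there is no per-document ACL."""
    boundary = boundaries.get(resource.boundary)
    return boundary is not None and principal in boundary.members


# Hypothetical setup: one open engineering workspace and one restricted library.
BOUNDARIES = {
    "eng-workspace": Boundary("eng-workspace", frozenset({"alice", "bob", "metrics-agent"})),
    "hr-library": Boundary("hr-library", frozenset({"carol"})),
}

meeting_notes = Resource("eng-workspace/planning-notes", "eng-workspace")
print(can_read("metrics-agent", meeting_notes, BOUNDARIES))
```

Because the check never looks at the individual document, nobody has to decide document by document who may see what; the only decision is which boundary a resource lives in.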
The effect of information transparency is clear. An agent that understands what was canceled in a meeting will not propose a discarded task. An agent that has learned the product specifications of other teams may even recommend proven success patterns. Because agents read vast amounts of text at speeds far exceeding humans, they act as problem-solvers who frequently point out related tasks that practitioners might have missed.
Of course, this does not mean that sensitive conversations must be fully exposed. If security is required, you can always send a DM to Claude or use existing services. Making information sharing the organization's 'default' requires a cultural shift in the workplace. However, the productivity gap between teams that work with an agent holding all the context and teams that do not will become impossible to ignore.
Lesson #2: Define Roles Clearly

The key to successful collaboration is not technology but 'who does what.' A team that combines humans and agents shares one roster, one set of outputs, and one workspace. Within it, each agent plays a different role, with its own permissions, skills, and tool access. One agent might handle data analysis, another might audit design standards, and a third might synthesize research materials.
When a project begins, team members first talk with the agents to establish specific roles and collaboration processes. Once roles are confirmed, one agent can call another, handing a task to the agent that has the right memory and access rights. The important point is to give each agent the tools its job requires. A data analysis agent needs access to 'BigQuery,' and a QA agent needs 'Playwright MCP' to perform at its best.
If role division is blurry, members end up running their own individual AIs separately, which duplicates work and fragments team context. A prime example is tracking data metrics. If one multi-player agent is dedicated to this task, the entire team looks at the same numbers. Humans should collaborate with agents in the same thread and concentrate on what only humans can handle: 'final judgment and strategic decision-making.'
In fact, one engineering team at Anthropic created a work 'roster' to formalize the roles of humans and agents. Defining roles in advance as skill files not only makes specialization easier, but also allows anyone in the company to quickly replicate and deploy the type of agent they need. As a project grows, the team simply adds an agent to handle the new area. This team recently 'hired' a new 'Release Manager Agent' to handle the software deployment process.
Lesson #3: Set a Clear Goal (North Star)

The decisive difference between an agent that only does what it's told and one that proactively suggests new tasks lies in 'clear direction.' The agents playing pivotal roles at Anthropic do not stop at completing assigned tasks; they go a step further and propose new projects and workflows. This proactivity emerges when a team with rich context and clear role division adds one final element: a fixed point of reference, the 'North Star.'
The North Star is an ambitious, macro-level goal that helps team members judge whether the task at hand is moving in the right direction. At Anthropic, the North Star is always set by a human and is rooted in the company's mission and business goals. Once the North Star is codified in text, human team members share it with the agents. Humans then make the final decision on which agents to empower to lead new workflows, as not every agent on the team possesses the necessary skill and reliability.
There is a real-world example. A team at Anthropic developing internal tools set a North Star to 'make the product onboarding process more useful.' An agent on the team then proactively suggested modifying the error message text that appeared during onboarding. This small change actually led to an increase in the weekly onboarding success rate. A clearly defined North Star like this gives agents consistent behavioral guidelines and opens up practical opportunities for them to contribute to team productivity.
Lesson #4: Expand Autonomy in Proportion to 'Trust'

Anthropic's teams did not hand an agent 500 bugs on day one. They grant autonomy only to the extent of the reliability an agent has proven, and then carefully expand its scope of work. There are success stories of engineers tasking agents with handling 500 bugs independently, but the start was gradual.
Just as a new colleague needs time to join a team, show what they can do, and get in sync, so does an agent. An adaptation period is essential: the team assigns a variety of tasks to gauge the agent's performance, and learns how to explain goals and which skill files and prompts yield the desired results. It is also important to re-test existing tasks whenever the underlying large language model (LLM) is updated. Safety guardrails that once kept an agent from going off-track might become shackles that hinder creative problem-solving for a more advanced model.
The keyword Anthropic emphasizes most in this process is 'verification.' Agents that performed exceptionally well in the long term had multiple mechanisms to verify their output before human review. They would attach automated tests to source code and apply clear rubrics and style guides to technical documents. By establishing standards and designing every process to be verifiable, the final quality stays on track. This is why a 'Doer-Verifier' structure—where one agent performs the work and another cross-checks it—is frequently used.
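The 'Doer-Verifier' structure can be sketched as a small loop, shown here under several assumptions: the doer and verifier are plain Python callables standing in for agents, the verifier returns notes that feed the next revision, and anything still failing after a fixed number of rounds is handed to a human. `Review`, `Outcome`, `run_doer_verifier`, and the toy agents are hypothetical names, not part of any Anthropic tooling.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Review:
    passed: bool
    notes: List[str] = field(default_factory=list)


@dataclass
class Outcome:
    draft: str
    rounds: int
    approved: bool      # False means the draft goes to a human with the notes attached
    notes: List[str]


def run_doer_verifier(
    task: str,
    doer: Callable[[str, List[str]], str],
    verifier: Callable[[str, str], Review],
    max_rounds: int = 3,
) -> Outcome:
    """Let the doer revise until the verifier passes the draft or rounds run out."""
    if max_rounds < 1:
        raise ValueError("max_rounds must be at least 1")
    feedback: List[str] = []
    draft = ""
    for round_no in range(1, max_rounds + 1):
        draft = doer(task, feedback)
        review = verifier(task, draft)
        if review.passed:
            return Outcome(draft, round_no, True, [])
        feedback = review.notes
    return Outcome(draft, max_rounds, False, feedback)


# Stand-in agents for demonstration; real ones would call a model.
def toy_doer(task: str, feedback: List[str]) -> str:
    draft = f"patch for {task}"
    if any("test" in note for note in feedback):
        draft += " + automated test"
    return draft


def toy_verifier(task: str, draft: str) -> Review:
    if "automated test" in draft:
        return Review(passed=True)
    return Review(passed=False, notes=["add an automated test"])
```

The point of the design is that the human sees either an approved draft or a rejected one with the verifier's notes attached, never an unchecked first attempt.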
An engineering leader at Anthropic formed a task force combining a few team members and many agents to resolve a massive project backlog. First, a group of agents scanned the entire backlog to check for owners, then calculated the complexity of unassigned items and scored them. Another group of agents then selected low-to-medium difficulty tasks from that list and modified the source code directly.
In the initial phase, the leader reviewed every decision made by the agents to categorize topics that required human judgment. Later, they trained the agents to judge these exceptions and escalate (report) them directly to the leader. They built an efficient system where humans intervene only in key decisions involving sophisticated trade-offs.
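The triage and escalation flow in this example might look roughly like the following sketch. The post does not say how complexity was scored, so the fields, weights, and threshold below are invented for illustration, as are the names `BacklogItem`, `complexity_score`, and `triage`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class BacklogItem:
    item_id: str
    owner: Optional[str]                 # None means nobody owns the item yet
    files_touched: int                   # invented proxy for size
    has_repro_steps: bool                # invented proxy for clarity
    needs_human_judgment: bool = False   # e.g. a trade-off the leader flagged earlier


def complexity_score(item: BacklogItem) -> int:
    """Toy score, 1 or higher; the weights are illustrative only."""
    score = min(item.files_touched, 5)
    if not item.has_repro_steps:
        score += 3
    return max(score, 1)


def triage(items: List[BacklogItem], max_agent_score: int = 5) -> Dict[str, List[str]]:
    """Sort backlog items into three buckets by owner, judgment flag, and score."""
    buckets: Dict[str, List[str]] = {"agent": [], "escalate": [], "leave": []}
    for item in items:
        if item.owner is not None:
            buckets["leave"].append(item.item_id)      # already owned
        elif item.needs_human_judgment:
            buckets["escalate"].append(item.item_id)   # report to the leader
        elif complexity_score(item) <= max_agent_score:
            buckets["agent"].append(item.item_id)      # low-to-medium difficulty
        else:
            buckets["leave"].append(item.item_id)      # too complex for now
    return buckets
```

Raising `max_agent_score` over time is one simple way to express the gradual expansion of autonomy that this lesson describes.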
Furthermore, this team had agents write their own weekly reports documenting 'Lessons & Missteps' to prevent the repetition of the same mistakes. Over time, the leader was able to delegate increasingly complex tasks to the agents, drastically reducing the time spent on guidance. Once the agents achieved independent autonomy, they were taught that 'human attention is the scarcest resource.' The agents' behavioral patterns were designed to batch questions, summarize key context to help human team members catch up quickly, and limit the number of items a person reviews at once.
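One way to picture the batching behavior is the minimal sketch below, which groups an agent's pending questions into digests that each carry a short context summary and a capped number of items. `Digest`, `build_digests`, and the default cap of three are assumptions made for illustration; the post does not describe an implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Digest:
    summary: str          # short context so the reader can catch up quickly
    questions: List[str]  # at most max_items questions per digest


def build_digests(questions: List[str], summary: str, max_items: int = 3) -> List[Digest]:
    """Split pending questions into review-sized batches, preserving their order."""
    if max_items < 1:
        raise ValueError("max_items must be at least 1")
    return [
        Digest(summary, questions[start:start + max_items])
        for start in range(0, len(questions), max_items)
    ]
```

Instead of interrupting a person once per question, the agent would send one digest at a time, so each interruption costs a single context switch.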
In the End, It Comes Back to the Basics of a Good Team
Concluding the post, Anthropic proposes a five-point self-diagnostic checklist for human-agent teams.
① Information Transparency · Is information shared openly and searchable by both humans and AI?
② Role Clarity · Does the team roster have separate slots for humans and AI, with distinct tasks for each?
③ Tool Appropriateness · Does each agent hold the tools and permissions it needs to do its job?
④ Quality Verifiability · Are evaluation criteria and tests in place to cross-verify outputs?
⑤ Milestone Clarity · Is there a clear North Star that everyone is working toward?
These questions all point to the same place: direction, roles, documentation, standards, and the room to learn from mistakes. These are the basics of organizational culture we already knew. AI agents are not some fantastical new technology. Rather, they are a mirror showing how a team falls apart when it skips the basics. The teams that get the best results from a partner as powerful as an agent are the ones most persistent in upholding these obvious basics.
