▸ Concept

Multi-agent safety

The unsolved problem of keeping networks of AI agents — each acting on instructions from another — from compounding errors, deferring accountability, or being manipulated through the messages they exchange.

Learn first

Agentic AI AI alignment

In a nutshell

When one AI agent orchestrates another, each step adds a new place for things to go wrong: a subtask is misread, a tool call returns unexpected output, or a malicious instruction injected into the environment gets passed along as if it were legitimate. The hard part is that individual agents can behave correctly in isolation while the system fails: trust, blame, and oversight become distributed across a chain that no single layer controls. No established discipline yet covers all of this — threat models, evaluation methods, and governance norms are still being worked out.

In megatrends

Artificial Intelligence

Models, agents, and AI–human collaboration — general-purpose capability scaling into every domain.

How this connects

Tap a node to open it

Multi-agent safety

Learn first

In megatrends

Artificial Intelligence

Finds citing this concept

A field that doesn't exist yet

How this connects