How to secure multiagent AI: A layered strategy for enterprises

By Sathish Kumar Saminathan, Jitendra Jain, Ashok Kumar Murugesan, Harry Keir Hughes

12 Jun, 2026

Insights

Multi-agent AI enhances speed, scalability, resilience, and data quality.
Yet interconnected agents, tools, and data pathways expand the attack surface.
A compromised agent can propagate harmful actions across the wider system.
Principal risks include prompt hijacking, data leakage, and privilege escalation.
Organizations require layered security, human oversight, and phased implementation.

Autonomous systems that plan, reason, and make decisions without human input are a growing force in industry. An estimated $2 trillion is set to be spent on AI in 2026, according to Gartner, a sign that value and enterprise resilience are becoming the key differentiators for organizations building out this technology.

As part of this, agentic AI is evolving from single-agent to multi-agent architectures. In multiagent systems, autonomous agents interact, collaborate, or compete to achieve an organization’s goals. These agents interact through standards such as the Model Context Protocol (MCP) and the Agent Communication Protocol (ACP), which help agents built on different software stacks work together.

For example, in a multiagent retail architecture Infosys is building, coordinator agents first scan multiple e-commerce sites, as a consumer normally would. They then call data, reasoning, and integration agents to apply consumer preferences and evaluate products, pricing, and availability, recommending purchase-ready options, all within the same chat session (Figure 1).

Figure 1. Multiagent architecture with shared access to databases, APIs, models, and memory

Source: Infosys

Specialized multi-agent architectures offer further benefits. They reduce noise and improve data quality; they speed up incident response, since multiple agents can evaluate and respond to threats more quickly; they scale quickly, since agents can be added without redesigning the core architecture; and they increase system resilience, since the failure of one agent doesn’t necessarily bring down the whole system.

The threat of multi-agent systems

But multi-agent systems differ from single-agent implementations, and the security ramifications are worrying.

Conventional AI security focuses on building robust defenses against adversaries, making sure data isn’t poisoned, and that AI models are reliable.

With the more decentralized architecture used by multi-agent AI, these security models are insufficient as there are more interacting components, trust links, tools and data paths that can be compromised.

Because one agent’s vulnerabilities can cascade across agents as they collaborate, the threat increases: harmful instructions, prompt injections, and data leaks all become harder to detect and contain.

For example, multi-agent systems are transforming security operations centers by enabling distributed intelligence and automation. However, this approach has its own vulnerabilities: organizations need to prevent malicious agents from releasing enterprise data, ensure that genuine threats are escalated properly, and maintain detailed audit trails so that recurring threats can be spotted quickly.

Other attack vectors include:

Prompt injection and instruction hijacking: Malicious inputs compromise an agent, and the compromise then propagates to its peers. For example, a detection agent relays hidden information, causing inappropriate responses downstream. This is a risk because some agents work with signals or data that are not available to other agents, but which still influence decisions.
Cross-agent data leakage: Sensitive context shared across agents can be extracted inadvertently, while knowledge held by only a small subset of agents further complicates detection.
Privilege escalation: Decentralized capabilities allow agents to gradually accumulate privileges they shouldn’t have. The threat is heightened when tool discovery is dynamic, or when the tools adapt their goals and plans based on human feedback and further context provided to the AI system. In both cases, agents consume a larger number of untrusted inputs and the model itself becomes modifiable, which increases the number of entry points that other agents can use to raise their own privileges.

A layered security approach

Organizations deploying multi-agent systems have to ensure all agent identities are managed: they should be identified, authenticated, authorized to access databases and other content repositories, and monitored within the perimeter of the organization, as discussed in the latest Infosys Knowledge Institute report, Tech Navigator: A path to the agentic-first enterprise.

However, existing AI security models don’t address threats that emerge specifically from how an agent interacts with users and other agents and not just from its internal model or training data.

This leaves enterprises exposed without clear governance or protection, a posture that can derail AI-first transformation journeys. Satish H.C, chief delivery officer at Infosys, speaking at the World Economic Forum in January 2026, said that the security risks arising from agentic systems are the number one concern in his mind for the years ahead.

A layered security approach can address some of these concerns. The framework used by Infosys is composed of seven foundational pillars:

Identity management: This layer establishes strong agent identity management using decentralized identifiers, a standard created by the World Wide Web Consortium (W3C) to give people, organizations, and devices self-controlled digital identities so that they don’t have to rely on central authorities. Some countries, such as Estonia and India, are already doing this at scale, providing citizens with full access to the agentic world in the economic, social, and political realms.
Secure communication management: Secure communication relies on end-to-end encryption and integrity checks to protect agent-to-agent and agent-to-system communication channels.
Authorization and policy guardrails: This layer implements fine-grained control through attribute-based access control, where the agent’s characteristics, such as its available resources or available actions, rather than roles or usernames, determine what an agent is or isn’t allowed to do. It also enforces permission tickets, each granting an agent the right to take a specific action and nothing more.
Behavioral monitoring and anomaly detection: This layer tracks message frequency, resource usage, and message semantics, meaning the rules that define what a message means, so that security teams know how it should behave when it’s sent, received, and processed by the agentic system. It also uses graph-based analytics, a way of analyzing relationships between agents, data, tools, and actions, to detect collusion patterns and abnormal behaviors.
Explainability and auditability: This layer provides a detailed trail of how decisions were made by AI agents, using graphs to link inputs, policies, and outputs, along with immutable audit logs for compliance and forensic analysis.
Formal trust modelling, or TCTLC: This layer is used to decide how much to trust each agent at any moment in time. TCTLC stands for trust calculation and trust lifecycle control, a way of measuring, updating, and controlling trust between agents over time.
Human-in-the-loop governance: Finally, this layer ensures human oversight for high-risk decisions. It does this through escalation points (predefined triggers where an agent stops acting autonomously and refers the situation to a higher authority), red-team simulations, and periodic assurance reviews. Organizations can then understand their security strengths and weaknesses from an adversary’s perspective. Red teaming needs to be a continuous exercise which, when done well, validates defenses, reveals gaps that can be easily countered, and sheds light on future threat strategies.
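
To make the authorization pillar more concrete, the sketch below combines an attribute-based check with a single-action permission ticket. It is an illustration under stated assumptions, not the Infosys framework’s implementation: the attribute names (`allowed_actions`, `reachable_resources`), the HMAC-signed ticket format, and the hard-coded `SECRET` are all hypothetical choices made for the example.

```python
import hashlib
import hmac
import json
import time

# Hypothetical signing key for the demo; a real system would fetch it from a key manager.
SECRET = b"demo-only-secret"


def is_allowed(agent_attrs, action, resource):
    """Attribute-based check: decide from the agent's attributes, not its role or name."""
    return (
        action in agent_attrs.get("allowed_actions", ())
        and resource in agent_attrs.get("reachable_resources", ())
    )


def _sign(payload):
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_ticket(agent_id, action, resource, ttl_seconds=60, now=None):
    """Grant one agent one action on one resource for a short time, and nothing more."""
    issued_at = time.time() if now is None else now
    claims = {
        "agent": agent_id,
        "action": action,
        "resource": resource,
        "exp": issued_at + ttl_seconds,
    }
    payload = json.dumps(claims, sort_keys=True)
    return {"payload": payload, "signature": _sign(payload)}


def verify_ticket(ticket, agent_id, action, resource, now=None):
    """Accept the ticket only if it is untampered, unexpired, and matches the request exactly."""
    if not hmac.compare_digest(_sign(ticket["payload"]), ticket["signature"]):
        return False
    claims = json.loads(ticket["payload"])
    current = time.time() if now is None else now
    return (
        claims["agent"] == agent_id
        and claims["action"] == action
        and claims["resource"] == resource
        and current < claims["exp"]
    )
```

In this sketch a ticket issued for a read on a pricing database cannot be reused for a write, by another agent, or after it expires, which is the "specific action and nothing more" property described in the list above.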

Different architecture, different security posture

The specific architecture of a multi-agent AI system shapes its security posture, governance model, and operational resilience.

Each pattern introduces advantages and challenges that influence risk exposure and mitigation strategies. Understanding these implications is critical for designing secure and trustworthy multi-agent ecosystems that balance performance with compliance and resilience (Figure 2).

Figure 2. Different architectural patterns demand different security controls

Source: Infosys

Going beyond the technology

Infosys has worked with clients to create a set of operational playbooks to guide secure and resilient multiagent AI operations. These playbooks serve as tools for maintaining security, resilience, and trust across agent-driven systems. They offer structured, repeatable approaches to prevent attacks, mitigate risks, and ensure continuous compliance.

The playbooks should involve a strategy for prompt injection defense, data leakage prevention, collusion detection, cascading failure resilience, and continuous assurance (Figure 3).

Figure 3. Playbook imperatives for different multiagent security strategies

Source: Infosys

A roadmap to success

A phased roadmap should be used to enable multi-agent deployment at enterprise scale. The following roadmap outlines the approach Infosys has been using in client implementations.

Phase one - immediate actions: In the first three months, organizations should start by working out what could go wrong in multi-agent systems, including the security risks and possible breaches. Teams should list every AI agent used in the organization, what it can do, and what sensitive systems or data it can reach. These same teams should then put basic protections in place so that agents and systems prove who they are before talking to each other, and so that each agent gets only the minimum access it needs. It is important to log what agents do and don’t do, while monitoring for unusual behavior. To help further, deploy mutual Transport Layer Security (mTLS), so that both the agent and the connected system authenticate each other before exchanging any data or commands. Also implement attribute-based access control for privilege management.
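
Logging agent activity is most useful when the log itself can be checked for tampering. One common way to approximate this is a hash chain, where each entry commits to the one before it. The sketch below is a minimal illustration of that idea; the entry fields and function names are hypothetical, and a production system would typically pair this with write-once storage and access controls.

```python
import hashlib
import json

# Placeholder "previous hash" for the first entry in the chain.
GENESIS = "0" * 64


def _digest(prev_hash, event):
    """Hash the previous entry's hash together with a canonical form of the event."""
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode()).hexdigest()


def append_entry(log, event):
    """Append an agent event, chaining it to the last entry in the log."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev_hash, "hash": _digest(prev_hash, event)})


def verify_log(log):
    """Return True only if every entry still matches its recorded hash and predecessor."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Editing or deleting any earlier entry changes its hash and breaks every link after it, so `verify_log` fails. This makes the log tamper-evident rather than truly immutable, which is why the storage layer still matters.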

Phase two - medium-term actions: Over the next three months to a year, the multi-agent system should become easier to check. Tools should be added that can explain, with plain reasoning, why agents made a decision, so the Responsible AI office can review and audit it. Multiagent decision-making then becomes transparent and auditable. Create a way to score how trustworthy an agent or action is, and ask security teams to test the setup and look for weaknesses. In this way, organizations can practice what they would do if something goes wrong, including how to spot an incident, how to stop it spreading, and how to recover from security breaches quickly and safely. To help further, implement trust modeling approaches appropriate to the organization’s agentic architecture, including fuzzy logic, where systems are scored on degrees of truth rather than Boolean logic.
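
As a rough illustration of scoring trust in degrees rather than as a yes/no decision, the sketch below uses two simple fuzzy membership functions and combines them with a fuzzy AND (the minimum). The inputs (`success_rate`, `anomaly_rate`) and every threshold are assumptions chosen for the example, not values from the Infosys roadmap.

```python
def ramp(x, low, high):
    """Membership degree that rises linearly from 0.0 at `low` to 1.0 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def trust_score(success_rate, anomaly_rate):
    """Return a trust degree in [0, 1] for an agent.

    success_rate: fraction of the agent's tasks completed correctly.
    anomaly_rate: fraction of the agent's messages flagged as anomalous.
    The thresholds below are hypothetical and would need tuning per deployment.
    """
    reliable = ramp(success_rate, 0.50, 0.95)       # degree to which the agent is "reliable"
    calm = 1.0 - ramp(anomaly_rate, 0.01, 0.10)     # degree to which it is "not anomalous"
    return min(reliable, calm)                      # fuzzy AND: both must hold
```

Using the minimum means an agent cannot offset frequent anomalies with a high success rate; other combination rules, such as a weighted average, would trade that strictness for smoother scores.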

Phase three - long-term actions: From one year onward, improve detection and oversight. Use even smarter monitoring that can notice things like collusion, where agents start to learn to work together in a harmful way, or new risks that appear only because the system is so complex. Finally, once at an appropriate level of multi-agent trust and security, build teams that keep up with new rules and standards for autonomous systems, sharing what they learn, while simultaneously investing in research, training, and testing so security keeps improving as the agents become more capable. To help further, develop governance boards and audit committees for oversight of multiagent systems while participating in emerging standards efforts, including the autonomous control policy and the autonomous negotiation protocol.
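
A very simple starting point for graph-based collusion monitoring is to treat agents as nodes and messages as edges, then flag pairs whose traffic is far above what is typical for the system. The sketch below does only that; the `factor` threshold and the use of the median as the baseline are assumptions for illustration, and unusually heavy traffic is a signal to investigate rather than proof of collusion.

```python
from collections import Counter
from statistics import median


def suspicious_pairs(messages, factor=3.0):
    """Flag agent pairs whose message volume far exceeds the typical pair.

    messages: iterable of (sender, receiver) tuples taken from the agent message log.
    factor:   hypothetical multiplier; a pair is flagged when its message count
              is greater than factor times the median count across all pairs.
    Returns a set of frozensets, each holding the two agent names in a flagged pair.
    """
    # Count traffic per unordered pair, ignoring messages an agent sends to itself.
    counts = Counter(frozenset((s, r)) for s, r in messages if s != r)
    if not counts:
        return set()
    typical = median(counts.values())
    return {pair for pair, n in counts.items() if n > factor * typical}
```

Richer analyses, such as looking at message content or at tightly connected groups of three or more agents, would build on the same edge list.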

Moving ahead

Moving ahead, two techniques will be important in enterprise ecosystems deploying agentic AI.

Self-healing agents will spot compromise signals, isolate affected components, roll back to known-good states, and quickly restore operations.

In parallel, collaborative risk management turns competitors into allies; shared indicators, playbooks, and lessons learned continuously improve every deployment, raising the baseline for multi-agent security across the enterprise and the industry.

For leaders, securing multi-agent AI is now a board-level priority. Organizations should act now to build identity, controls, monitoring, and human oversight in from day one, and collaborate across industry to sustain trust at scale.
