Agentic AI

Multi-Agent Architectures

Max Corbridge

Cofounder

July 10, 2025

Hello and welcome back! Apologies for the lack of updates over the last two weeks. I had some personal leave, but I am now back and talking this week about how multi-agent architectures are built. We've covered a good amount of agentic AI threats, breaches and testing guides recently, but I think it's still quite easy to stay in the theoretical world whilst discussing these, without ever really understanding how these systems are built and what they look like in production. My plan today is to cover some high-level architecture for multi-agent systems, and hopefully explore a few real-world systems too!

First, I want to say that we will be ignoring all 'no code' and 'low code' type architectures, such as n8n. Whilst these products are great for getting AI agents into the hands of more people with less of a development background, they are ultimately designed for ease of use and not for transparent architectures, which is what we are interested in here. So, to get going I did what I always do when I am deep diving a technical topic and opened up countless YouTube videos on the topic to binge with coffee. I like to find a range of depths, perspectives and biases, watch all of them (usually on 2x speed), make notes on each and then summarise in my own words. So, let's get started!

First and foremost, why do we use several agents and not just one? Agents are theoretically capable of handling any number of tools, so why not just give one agent access to everything it needs to perform a task and let it do its thing? Well, the supposed sweet spot for tools is around 5-10 per agent. This allows agents to perform several useful tasks, but doesn't drown the agent in myriad tool calls, competing goals, and overly long context windows. What tends to work much better is having agents which are specialised for certain tasks, like a planner, a researcher, or a mathematician. This setup not only improves performance, but also introduces modularity, which can be great for debugging, maintenance, and greater control.

Architectures

Okay, with a clear understanding of why we might want to use several agents in unison, let's look at how we might want to orchestrate this ensemble. For this I'll mainly be going off LangChain's brilliant video on the topic. The first architecture we will look at is a 'network' architecture.

In this architecture all agents, along with their tools, are free to talk to each other at will. Ultimately, it is the decision of the agents themselves who they communicate with and why. Frameworks built on this architecture include CrewAI and OpenAI's Swarm, so it's certainly a popular and well-used architecture. The problem with the network architecture, however, is that its communication pattern is too loose given its agent-driven nature. Therefore, you may not see this deployed in production all that often.
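
To make the 'agents decide who to talk to' idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the agent names, the `run_network` helper and the hard-coded hand-off rules are mine, and in a real system each agent would ask an LLM who to hand off to rather than follow a fixed rule. It is not how CrewAI or Swarm are implemented.

```python
from typing import Callable, Dict, Optional, Tuple

# Each agent takes the running message and returns
# (updated_message, name_of_next_agent_or_None).
Agent = Callable[[str], Tuple[str, Optional[str]]]


def researcher(message: str) -> Tuple[str, Optional[str]]:
    # Hypothetical rule: a real agent would let an LLM choose the hand-off.
    return message + " -> researched", "writer"


def writer(message: str) -> Tuple[str, Optional[str]]:
    return message + " -> drafted", "reviewer"


def reviewer(message: str) -> Tuple[str, Optional[str]]:
    # Returning None as the next agent ends the conversation.
    return message + " -> reviewed", None


AGENTS: Dict[str, Agent] = {
    "researcher": researcher,
    "writer": writer,
    "reviewer": reviewer,
}


def run_network(start: str, message: str, max_hops: int = 10) -> str:
    """Follow agent-chosen hand-offs until an agent stops or max_hops is hit."""
    current: Optional[str] = start
    hops = 0
    while current is not None and hops < max_hops:
        message, current = AGENTS[current](message)
        hops += 1
    return message


print(run_network("researcher", "task"))
```

Note the `max_hops` guard: because no single component owns the routing, nothing else stops two agents from handing work back and forth forever, which is a small taste of why this pattern can feel too loose for production.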

Enter the Supervisor. Perhaps the most famous of the architectures is the Supervisor. In this model one single agent decides upon the action that needs to be taken, and drives the action forward, routing the communication through the agents that it sees fit. This brings the communication format together in a stricter and more repeatable fashion, which may well be required for business adoption. There is also a version of supervisor where a single supervisor is also given access to a handful of tools which it treats as sub-agents and calls upon them in the same way it would call upon a fully-fledged agent. This could simplify design if only a small number of tools are going to be needed.
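
A rough sketch of the supervisor pattern, again in plain Python and under stated assumptions: `choose_next` is a hypothetical stand-in for the supervisor's LLM call, and the worker names and the `FINISH` sentinel are made up for the example. The point is only that every hand-off passes through one routing function.

```python
from typing import Callable, Dict, List


def planner(task: str) -> str:
    return f"plan for {task}"


def researcher(task: str) -> str:
    return f"notes on {task}"


# The workers never talk to each other; they only answer the supervisor.
WORKERS: Dict[str, Callable[[str], str]] = {
    "planner": planner,
    "researcher": researcher,
}


def choose_next(task: str, done: List[str]) -> str:
    """Stand-in for the supervisor's LLM call.

    Assumption: we simply pick the first worker that has not run yet.
    A real supervisor would prompt a model with the task and results so far.
    """
    for name in WORKERS:
        if name not in done:
            return name
    return "FINISH"


def supervisor(task: str) -> Dict[str, str]:
    """Route the task through workers until the routing step says FINISH."""
    results: Dict[str, str] = {}
    while True:
        next_worker = choose_next(task, list(results))
        if next_worker == "FINISH":
            return results
        results[next_worker] = WORKERS[next_worker](task)


print(supervisor("quarterly report"))
```

Because all routing lives in `choose_next`, you get a single place to log, test and constrain, which is where the stricter, more repeatable behaviour comes from.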

Next we have Hierarchical. Hierarchical architectures are great for handling complex problems. They go beyond just a single supervisor and its agents, and build clusters of supervisors and agents which all report back to a central supervisor at the top of the hierarchy. Naturally, this approach lends itself to tasks that require a ton of agents, tool calls, central state management, etc.
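
The hierarchy can be sketched as supervisors nested inside a supervisor. The team names, the `make_team_supervisor` helper and the lambda 'agents' below are all hypothetical, and each team simply runs every worker in order, where a real team supervisor would decide with an LLM.

```python
from typing import Callable, Dict

Worker = Callable[[str], str]


def make_team_supervisor(workers: Dict[str, Worker]) -> Worker:
    """Build a mid-level supervisor that runs its own workers and reports back."""

    def team(task: str) -> str:
        # Assumption: every worker runs once, in insertion order.
        outputs = [f"{name}: {work(task)}" for name, work in workers.items()]
        return "; ".join(outputs)

    return team


research_team = make_team_supervisor(
    {
        "search": lambda t: f"found sources on {t}",
        "summarise": lambda t: f"summary of {t}",
    }
)
writing_team = make_team_supervisor({"draft": lambda t: f"draft about {t}"})


def top_supervisor(task: str) -> Dict[str, str]:
    """Top of the hierarchy: delegates to team supervisors, not to agents."""
    teams: Dict[str, Worker] = {"research": research_team, "writing": writing_team}
    return {name: team(task) for name, team in teams.items()}


print(top_supervisor("agent security"))
```

From the top supervisor's point of view a whole team looks like a single worker, which is what keeps each level's decision small even when the total number of agents is large.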

Finally, we have the most common, which is something entirely Custom. With such a nascent technology it is no surprise that many of these architectures may get you close to what you need but will likely require a good deal of customisation to ensure they serve you as you need them to. This customisation is often the missing piece that can get a multi-agent system to the point where it is production-ready.

So, with the above we have a good understanding of how these agents are laid out and roughly how they communicate with each other. One thing that may still escape some readers though is where is this all happening? Are agents individual systems all living on a single machine? Do they run as scripts? Do they have individual identities? Even at the depth we've gone it can be hard to answer some of these fundamental questions, so let's have a go.

Where do agents exist?

In essence, an 'agent' is usually a few lines of Python code which has a memory of what is happening around it, referred to as a 'state'. Through more Python functions the agent receives a conversation / input / task from the outside world, it creates an LLM prompt to decide what to do, it optionally calls a tool if it believes this is needed, and it produces the outcome and passes it back to the 'supervisor' (assuming we are using a supervisor architecture).
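
Here is what that loop might look like as a small, runnable sketch. The `fake_llm` function is a hypothetical stand-in for a real model call, and the `TOOL ...` / `ANSWER ...` reply format, the `AgentState` class and the `add` tool are all assumptions I've made up to keep the example self-contained.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentState:
    """The agent's memory: a running transcript of what has happened so far."""
    messages: List[str] = field(default_factory=list)


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call, driven by the last prompt line."""
    last_line = prompt.splitlines()[-1]
    if last_line.startswith("tool:"):
        # A tool result is available, so answer with it.
        return "ANSWER " + last_line[len("tool:"):].strip()
    if "add" in last_line:
        # Pretend the model decided it needs the calculator tool.
        return "TOOL add 2 3"
    return "ANSWER no tool needed"


TOOLS: Dict[str, Callable[..., str]] = {
    "add": lambda a, b: str(int(a) + int(b)),
}


def run_agent(task: str, state: AgentState, max_steps: int = 5) -> str:
    """Receive a task, prompt the 'LLM', optionally call a tool, return a result."""
    state.messages.append(f"user: {task}")
    for _step in range(max_steps):
        prompt = "\n".join(state.messages)  # build the prompt from state
        decision = fake_llm(prompt)
        kind, _sep, rest = decision.partition(" ")
        if kind == "TOOL":
            name, *args = rest.split()
            state.messages.append(f"tool: {TOOLS[name](*args)}")
        else:
            state.messages.append(f"agent: {rest}")
            return rest  # this is what goes back to the supervisor
    return "gave up"


state = AgentState()
print(run_agent("please add 2 and 3", state))
print(state.messages)
```

The state is nothing more exotic than a list that every step reads from and appends to, and the 'decision' to use a tool is just the model's text output being parsed by ordinary code.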

Okay great, but you've still not answered the question: where is this usually happening? Well, there are a few options for this. When it comes to the 'agents' themselves, these may be running in containers as microservices. For example, these could be deployed on Kubernetes clusters, and every time an incoming request is received the container spins up, performs its LLM call, generates the response, and then returns to an idle state. Similarly, this could be abstracted even further to run the agent as just a serverless function within your favourite cloud provider.
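
As a rough illustration of the serverless option, the sketch below wraps an agent in a handler loosely modelled on the `handler(event, context)` convention used by AWS Lambda's Python runtime. The event shape (a JSON `body` with `task` and `history` fields) and the `call_llm` stub are assumptions for the example, not any provider's required format.

```python
import json
from typing import Any, Dict, List


def call_llm(prompt: str) -> str:
    """Hypothetical stand-in for a call to a hosted model."""
    return f"echo: {prompt}"


def handler(event: Dict[str, Any], context: Any = None) -> Dict[str, Any]:
    """Wake up for one request, do one agent step, return, go idle again."""
    body = json.loads(event.get("body") or "{}")
    task: str = body.get("task", "")
    # The function keeps nothing between invocations, so any conversation
    # state has to arrive with the request (or come from an external store).
    history: List[str] = body.get("history", [])

    reply = call_llm(task)
    history = history + [task, reply]

    return {
        "statusCode": 200,
        "body": json.dumps({"reply": reply, "history": history}),
    }


response = handler({"body": json.dumps({"task": "hi"})})
print(response["body"])
```

The design consequence worth noticing is that the agent's 'state' can no longer live in a Python variable. It has to travel with each request or sit in something like a database or cache that outlives the function.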

Equally, much of this is now being wrapped up into the major providers' various AI 'studio' offerings, such as Copilot Studio, OpenAI Playground, SageMaker Studio Lab, etc. These are designed to make it easier for people to start building this stuff by taking the hosting of your agents and underlying LLMs out of the equation.

So, by going through this I've certainly shored up some of my own patchy understanding around how this stuff actually works and runs in the real world. I'm interested in talking about the different types of agents such as simple reflex, utility, or learning agents which will hopefully further clarify our collective understanding on how these systems work, so perhaps I'll cover that next week!
