Meet Decepticon: The Autonomous AI Red Team Agent That's Stress-Testing Every System We Have

We put PurpleAILAB's Decepticon — a multi-agent autonomous red teaming platform — against our own infrastructure. Here's what we learned running a full kill-chain attack simulation against systems we've been building for months.

Aniket Karne
Senior DevOps Engineer

There’s a particular kind of anxiety that comes with deploying a multi-agent system into production. You’ve built the orchestrator, wired up the tool calls, stress-tested the happy paths — and then you lie awake wondering what an intelligent, motivated attacker would do if they got a look at your infrastructure from the outside.

That’s been on my mind lately. We’ve been building a lot at Aniket’s lab: aco-system for multi-agent storytelling pipelines, a coffee recommendation engine, a mock interview platform, and various internal tools. Each one has its own attack surface: exposed APIs, credential handling, Docker sandbox environments, LLM tool call boundaries. We know what the happy paths look like. We don’t know what a creative adversary would do with them.

So when PurpleAILAB — a South Korean security research team — released Decepticon, an autonomous AI red team agent built on LangChain and LangGraph, I was immediately curious. And then I was nervous. And then I was excited to try it against our own systems.

What Is Decepticon, Exactly?

Decepticon is not a penetration testing script. It’s not a vulnerability scanner with an AI wrapper. It’s a multi-agent autonomous red team — a system of specialized AI agents that coordinate to execute a full kill chain attack, from initial reconnaissance to post-exploitation, without human intervention.

The architecture is worth understanding in detail because it’s genuinely different from most “AI for security” tools I’ve seen.

At its core, Decepticon uses a LangGraph workflow graph to orchestrate specialized sub-agents, each responsible for a stage of the attack:

  • Reconnaissance Agent (Recon): maps the target surface, identifies entry points, enumerates services
  • Exploitation Agent (Exploit): identifies and exploits vulnerabilities, using Metasploit-style module execution
  • Post-Exploitation Agent (Post-Exploit): drops payloads, escalates privileges, harvests credentials
  • Lateral Movement Agent: moves through the network, pivots to new targets
  • Defense/Detection Agent: in some configurations, simulates blue team detection to test alert fidelity
  • Ralph (Orchestrator Loop): the meta-agent that reads operation plans, spawns fresh agents per objective, and persists findings across iterations

The orchestrator loop is the most interesting architectural decision. Ralph doesn’t just delegate — it maintains state across attack iterations, so if an early exploit attempt fails, it reasons about why it failed and pivots strategy in the next iteration. This is fundamentally different from a sequential script that runs tools A → B → C and stops when something errors.
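To make the difference from a sequential script concrete, here is a minimal plain-Python sketch of a loop that keeps state between iterations and moves on to a different strategy after a failure. It is not Decepticon's code: `OperationState`, `run_operation`, and both strategy functions are hypothetical names, the strategies are harmless stand-ins that only return canned results, and a real orchestrator would ask an LLM to pick the next strategy rather than take the next untried one.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OperationState:
    """Working memory carried across iterations (hypothetical structure)."""
    objective: str
    attempts: list = field(default_factory=list)  # one dict per strategy tried
    findings: list = field(default_factory=list)  # notes from successful attempts


# A strategy inspects the state and returns (succeeded, note).
Strategy = Callable[[OperationState], tuple]


def run_operation(objective, strategies, max_iterations=5):
    """Try strategies one per iteration, remembering every outcome."""
    state = OperationState(objective=objective)
    for _ in range(max_iterations):
        tried = {attempt["strategy"] for attempt in state.attempts}
        remaining = [name for name in strategies if name not in tried]
        if not remaining:
            break  # nothing left to pivot to
        name = remaining[0]
        succeeded, note = strategies[name](state)
        state.attempts.append(
            {"strategy": name, "succeeded": succeeded, "note": note}
        )
        if succeeded:
            state.findings.append(note)
            break
    return state


# Simulated strategies: no network access, just canned outcomes.
def try_default_login(state):
    return False, "login rejected: lockout policy active"


def try_exposed_backup(state):
    # Later strategies can read what failed earlier and adapt to it.
    failures = [a for a in state.attempts if not a["succeeded"]]
    return True, f"readable backup found after {len(failures)} failed attempt(s)"


state = run_operation(
    "reach staging database",
    {"default_login": try_default_login, "exposed_backup": try_exposed_backup},
)
for attempt in state.attempts:
    print(attempt)
```

The point of the sketch is the `attempts` list: a script that stops at the first error has nothing equivalent, so it cannot avoid repeating a dead end or use the failure as input to the next decision.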

Decepticon also integrates with Sliver C2 — a legitimate red team command-and-control framework — so post-exploitation stages can establish persistent implants that look and behave like real malware infrastructure. This is how it goes beyond “here’s a vulnerability report” into “here’s what an attacker who found that vulnerability could actually do.”

The Kill Chain, Executed

Let me walk through what a full Decepticon run looks like against a target environment. The sequence illustrates how the system reasons through an attack.

Phase 1: Reconnaissance

The Recon agent starts with whatever target information you’ve provided — a domain, an IP range, a description of what you’re authorized to test. It then autonomously decides what to enumerate: DNS records, open ports, running services, web application fingerprints, SSL certificate details, Git repository exposure.

What makes this interesting from an AI agent perspective is that Recon doesn’t just run a fixed set of tools. It reasons about what it finds. If it discovers a web server running an older version of nginx, it notes the known CVEs. If it finds an open Git repository, it clones and scans the commit history for exposed secrets. If it finds a login portal, it attempts basic enumeration to identify the software stack.

The output isn’t just a list of open ports. It’s a structured recon report that the orchestrator uses to prioritize the next phase.
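A structured recon report could look something like the sketch below. The field names, the 1 to 3 exposure scale, and the scoring rule are all assumptions made for illustration, not Decepticon's actual report schema; the hosts use the reserved `example.test` domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReconFinding:
    host: str
    service: str
    detail: str
    exposure: int      # assumed scale: 1 = internal only, 3 = internet-facing
    confidence: float  # 0.0 to 1.0, how sure the agent is about the finding


def prioritize(findings):
    """Order findings so the highest exposure * confidence comes first."""
    return sorted(findings, key=lambda f: f.exposure * f.confidence, reverse=True)


report = [
    ReconFinding("git.example.test", "http", "repository directory listable", 2, 0.5),
    ReconFinding("www.example.test", "nginx", "outdated server version banner", 3, 0.9),
    ReconFinding("app.example.test", "login", "framework fingerprint guessed", 3, 0.4),
]

for finding in prioritize(report):
    print(finding.host, finding.detail)
```

Because every finding carries the same fields, an orchestrator can rank them with one function instead of parsing free-form tool output.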

Phase 2: Exploitation

The Exploit agent receives the recon findings and begins matching vulnerabilities to available exploits. This is where Decepticon differs significantly from a simple vulnerability scanner — it doesn’t just report “this service is vulnerable to CVE-2024-XXXX.” It executes the exploitation path.

If a target has an unpatched Apache Struts instance, the Exploit agent will attempt remote code execution, using a combination of known exploit modules and, where necessary, custom payloads generated based on the target’s specific configuration.

The agent also handles verification: it confirms that the exploit actually worked rather than assuming it did. In some configurations it also produces patching recommendations and attempts to verify that a suggested fix actually closes the attack path.

Phase 3: Post-Exploitation and Lateral Movement

This is the phase that separates red teaming from penetration testing. After gaining initial access, the Post-Exploit agent establishes persistence — dropping a Sliver C2 implant that gives a persistent foothold even if the initial entry point is patched.

From there, the Lateral Movement agent uses harvested credentials and token reuse to pivot to other systems in the environment. The system reasons about the network topology it discovers and identifies high-value pivot targets — domain controllers, internal APIs, databases.
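Reasoning about topology can be approximated as path-finding on a graph of "which system can reach which". The sketch below is a generic breadth-first search over a hand-written model, the same idea defenders use for attack-path analysis; the node names and the `shortest_path` helper are hypothetical, and nothing here touches a real network.

```python
from collections import deque


def shortest_path(edges, start, targets):
    """Return the shortest hop sequence from start to any target, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node in targets:
            return path
        for neighbor in edges.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(path + [neighbor])
    return None


# Hypothetical reachability model: key can connect to each listed value.
edges = {
    "web-frontend": ["internal-api"],
    "internal-api": ["database", "build-runner"],
    "build-runner": ["domain-controller"],
}
high_value = {"database", "domain-controller"}

print(shortest_path(edges, "web-frontend", high_value))
```

Removing a single edge from the model (for example `internal-api` to `database`) and re-running the search is a quick way to see whether a proposed segmentation change actually lengthens or breaks the path.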

The result of a full run isn’t a vulnerability list. It’s an attack story: here’s how an adversary could have moved from external reconnaissance to full domain compromise, with evidence at every step.

What We Learned Running It Against Our Own Systems

We’ve been running Decepticon against our own infrastructure for about a week now. Here’s what we found.

Finding 1: Our API endpoint exposure was worse than we thought

Our aco-system exposes several internal tool-call endpoints. We were confident that the authentication layer was solid. Decepticon found a misconfiguration on one of our agent-spawning endpoints that allowed the rate limit to be bypassed, which could let an unauthorized party trigger agent runs against our internal tooling. This wasn’t a code vulnerability. It was configuration drift that crept in during a deployment two weeks earlier.
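Drift like this is cheap to catch if the intended settings are written down and compared against what is deployed. A minimal sketch, assuming both sides can be exported as plain dictionaries; the endpoint path and the setting names (`auth_required`, `rate_limit_per_minute`) are invented for the example.

```python
def find_drift(baseline, deployed):
    """Compare deployed endpoint settings against the intended baseline."""
    problems = []
    for endpoint, expected in baseline.items():
        actual = deployed.get(endpoint)
        if actual is None:
            problems.append(f"{endpoint}: missing from deployed config")
            continue
        for key, want in expected.items():
            got = actual.get(key)
            if got != want:
                problems.append(f"{endpoint}: {key} is {got!r}, expected {want!r}")
    # Endpoints nobody reviewed are drift too.
    for endpoint in sorted(deployed.keys() - baseline.keys()):
        problems.append(f"{endpoint}: deployed but not in baseline")
    return problems


# Hypothetical baseline and deployed settings.
baseline = {
    "/agents/spawn": {"auth_required": True, "rate_limit_per_minute": 10},
}
deployed = {
    "/agents/spawn": {"auth_required": True, "rate_limit_per_minute": None},
    "/agents/debug": {"auth_required": False},
}

for problem in find_drift(baseline, deployed):
    print(problem)
```

Run as a deployment gate, a check of this shape fails the release as soon as a setting silently changes, instead of waiting for a red team run to notice.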

Finding 2: Docker networking defaults are a false friend

We run several components in Docker containers with bridged networking. Decepticon’s lateral movement capabilities demonstrated that if one container is compromised, the internal Docker DNS could be abused to discover and pivot to other containers that we assumed were isolated. This is documented behavior of Docker networking, but it’s easy to forget when you’re in a development flow.
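One way to check the isolation assumption is to list which containers share a network, since containers attached to the same Docker bridge network can talk to each other. The sketch below works on a hand-written mapping of container name to network names; in practice that mapping could be built from `docker network inspect` output. The container and network names are made up.

```python
from itertools import combinations


def reachable_pairs(container_networks):
    """Return (a, b, shared_networks) for every pair sharing a network."""
    pairs = []
    for a, b in combinations(sorted(container_networks), 2):
        shared = container_networks[a] & container_networks[b]
        if shared:
            pairs.append((a, b, shared))
    return pairs


# Hypothetical layout: container name -> networks it is attached to.
layout = {
    "story-orchestrator": {"app_net"},
    "coffee-api": {"app_net"},
    "memory-store": {"data_net"},
    "interview-worker": {"app_net", "data_net"},
}

for a, b, shared in reachable_pairs(layout):
    print(f"{a} <-> {b} via {sorted(shared)}")
```

In this layout `memory-store` never shares a network with `coffee-api`, but `interview-worker` sits on both networks, so compromising it bridges the two. That kind of dual-homed container is the one to look for.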

Finding 3: The credential handling in our memory system needed a review

Aniket’s long-term memory system stores context about projects and preferences. We had assumed that file-level permissions were sufficient to protect sensitive memory entries. Decepticon’s post-exploit agent demonstrated that a compromised process running as our agent user could read memory files that we hadn’t classified as sensitive. Specifically, some older memory entries contained project naming conventions and internal tool names that, combined with other findings, could enable social engineering paths.
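File permissions do not help when the attacker runs as the same user, so a more useful audit asks what the agent user can read and whether any of it looks sensitive. The sketch below is one rough way to do that; the `audit_memory_dir` helper and the keyword pattern are assumptions, and a real review would need a pattern tuned to the project's own naming.

```python
import os
import re
from pathlib import Path

# Assumed markers of sensitive content; adjust to the project's vocabulary.
SENSITIVE = re.compile(
    r"(api[_-]?key|token|password|internal[_-]tool)", re.IGNORECASE
)


def audit_memory_dir(root):
    """List files the current process can read that contain sensitive markers."""
    flagged = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or not os.access(path, os.R_OK):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        hits = sorted({match.group(0).lower() for match in SENSITIVE.finditer(text)})
        if hits:
            flagged.append((str(path), hits))
    return flagged


# Usage (the directory is a placeholder): run as the agent user, so the
# result reflects what a compromised agent process could actually read.
# for path, hits in audit_memory_dir("/path/to/memory"):
#     print(path, hits)
```

Anything this flags is a candidate for encryption at rest, a separate service account, or deletion, since tightening file modes alone would not have stopped the read.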

Finding 4: The multi-agent tool call chain is a fascinating attack surface

This is the finding I’m most excited about from an AI engineering perspective. When you have multiple agents calling tools and passing results to each other, the inter-agent communication becomes an attack surface. Decepticon’s analysis of our aco-system found that certain tool call result objects were being passed without sufficient validation, creating a potential for prompt injection via corrupted tool responses. If one agent in the chain is compromised — or if a malicious tool response is crafted — it could influence downstream agent behavior in ways we hadn’t designed for.
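A first line of defense is to validate every tool result before another agent sees it. The sketch below checks the result against an expected shape and rejects text that looks like an instruction. The tool name, the schema, and the regular expression are hypothetical, and pattern matching is a weak heuristic that will miss rephrased injections, so treat it as one layer rather than a fix.

```python
import re
from dataclasses import dataclass

# Crude heuristic for instruction-like text inside tool output.
INSTRUCTION_LIKE = re.compile(
    r"(ignore (all |any )?(previous|prior) instructions|you are now|system prompt)",
    re.IGNORECASE,
)

# Hypothetical registry: tool name -> required fields and their types.
EXPECTED_FIELDS = {"search_notes": {"title": str, "snippet": str}}


@dataclass
class ToolResult:
    tool: str
    payload: dict


class ToolResultRejected(ValueError):
    """Raised when a tool result must not be passed downstream."""


def validate_tool_result(result):
    """Return the payload if it matches its schema and looks like plain data."""
    schema = EXPECTED_FIELDS.get(result.tool)
    if schema is None:
        raise ToolResultRejected(f"no schema registered for tool {result.tool!r}")
    if set(result.payload) != set(schema):
        raise ToolResultRejected(f"unexpected fields: {sorted(result.payload)}")
    for name, expected_type in schema.items():
        value = result.payload[name]
        if not isinstance(value, expected_type):
            raise ToolResultRejected(f"field {name!r} has the wrong type")
        if isinstance(value, str) and INSTRUCTION_LIKE.search(value):
            raise ToolResultRejected(f"field {name!r} contains instruction-like text")
    return result.payload
```

Rejecting unknown tools and unexpected fields matters as much as the text check: it means a compromised agent cannot smuggle extra keys into the next agent's context.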

This is a relatively new class of security consideration for multi-agent systems, closely related to indirect prompt injection, and practical guidance for it is still thin. More on this in a future post. We’re still working through the implications.

The LangChain/LangGraph Architecture Choice Matters

I want to pause on the fact that Decepticon is built on LangChain and LangGraph specifically, because I think that’s a meaningful technical choice rather than just a framework preference.

LangGraph’s stateful graph model is well-suited to the red team loop because the attack process is inherently stateful and iterative. The orchestrator needs to maintain a working memory of what’s been attempted, what succeeded, what failed, and what the current position in the kill chain is. A stateless agent loop — where each invocation starts fresh — would struggle with the kind of strategic pivoting that makes Decepticon interesting.

LangChain’s tool abstraction layer also matters. Decepticon’s agents call out to standard security tools — nmap, sqlmap, Metasploit modules, custom Python scripts — through LangChain’s unified tool interface. This means the agents can reason about which tool to use in a given situation rather than just executing a fixed playbook.
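The idea of a unified tool interface can be shown without the framework. The sketch below is plain Python, not LangChain's API: each tool is a name, a description, and a callable, and a selector picks a tool by matching the current observation against the descriptions. The word-overlap selector is a deliberately naive stand-in for the LLM's reasoning, and both tools are stubs that return canned strings.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    name: str
    description: str
    run: Callable[[str], str]


def choose_tool(tools, observation):
    """Pick the tool whose description shares the most words with the observation."""
    words = set(observation.lower().split())

    def overlap(tool):
        return len(words & set(tool.description.lower().split()))

    best = max(tools, key=overlap)
    return best if overlap(best) > 0 else None


# Stub tools: they only return canned text and never touch a network.
tools = [
    Tool("port_scan", "enumerate open ports and services on a host",
         lambda target: f"[stub] would scan {target}"),
    Tool("sql_probe", "test a web form or query parameter for sql injection",
         lambda target: f"[stub] would probe {target}"),
]

chosen = choose_tool(tools, "login form passes query parameter to database")
print(chosen.name if chosen else "no suitable tool")
```

Because every tool exposes the same three attributes, adding a new capability means appending one entry to the list; the selection logic does not change.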

The combination means the agents are doing something closer to “strategic security reasoning” than “run tool X and report results.” That difference shows up in the kinds of attack paths Decepticon discovers that scripted tools miss.

What This Means for the AI Agent Security Landscape

Decepticon is part of a broader shift happening in 2026: AI agents are becoming capable enough to be used offensively, not just defensively. We’ve seen the defensive AI security tools mature — code scanners, vulnerability detectors, guardrail systems. Decepticon represents the offensive side of that equation.

For teams building multi-agent systems, this has an immediate practical implication: you need to assume that sophisticated adversaries will be using tools like this against your systems. Not script kiddies running nmap — AI agents running full kill chain simulations that reason about your specific architecture and adapt when something doesn’t work.

This means security testing for AI agent systems needs to evolve. Static code review isn’t sufficient. You need:

  1. Red team exercises using autonomous agents — tools like Decepticon that can simulate real attack paths
  2. Agent-specific threat modeling — thinking about inter-agent communication as an attack surface
  3. Continuous validation — not just “we passed a security review at launch” but ongoing automated red teaming
  4. Tool call sandboxing — treating every tool call from an agent as a potential attack vector
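For the last item, tool call sandboxing can start as a policy check that runs before any tool executes. A minimal sketch, assuming tools are identified by name and file-oriented tools take a `path` argument; `ToolPolicy`, `check_tool_call`, and the `/workspace` root are hypothetical.

```python
import posixpath
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolPolicy:
    allowed_tools: frozenset
    allowed_roots: tuple = ()


class ToolCallDenied(PermissionError):
    """Raised when an agent requests a tool call outside the policy."""


def check_tool_call(policy, tool, args):
    """Raise ToolCallDenied unless the call is inside the policy."""
    if tool not in policy.allowed_tools:
        raise ToolCallDenied(f"tool {tool!r} is not on the allowlist")
    path = args.get("path")
    if path is not None:
        # normpath collapses ".." segments; it does not resolve symlinks,
        # so a real sandbox also needs filesystem-level isolation.
        normalized = posixpath.normpath(path)
        inside = any(
            normalized == root or normalized.startswith(root.rstrip("/") + "/")
            for root in policy.allowed_roots
        )
        if not inside:
            raise ToolCallDenied(f"path {path!r} is outside the allowed roots")


policy = ToolPolicy(
    allowed_tools=frozenset({"read_file"}),
    allowed_roots=("/workspace",),
)

check_tool_call(policy, "read_file", {"path": "/workspace/notes/a.md"})  # allowed
```

Denying by default is the design choice that matters here: a new tool or a new directory has to be added to the policy deliberately, so an agent that has been talked into something unexpected fails closed.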

The good news is that the same infrastructure being used to build offensive AI security tools can be used defensively. Decepticon is open source. You can run it against your own systems, find your own gaps, and fix them before someone else finds them for you.

Looking Forward

We’re still in the early stages of understanding what autonomous AI red teaming means for the systems we’re building. The tool call chain security findings alone are going to keep us busy for weeks — we’ve already started redesigning how our agents validate tool call inputs and outputs.

What I find most compelling about Decepticon is what it represents: AI agents are becoming sophisticated enough to reason about complex, multi-step objectives with real-world consequences. That’s true in security. It’s true in code generation. It’s true in research. The implications for how we build, test, and deploy agentic systems are only starting to become clear.

The future of AI security isn’t just about building better guardrails. It’s about building systems that can think adversarially — about themselves, about each other, about the systems they operate in. Decepticon is one of the most sophisticated examples I’ve seen of that kind of adversarial reasoning in action.

We’re going to keep running it against our infrastructure. I’ll report back on what we find.


If you’re building multi-agent systems and want to discuss security testing approaches, the best place to find Aniket is on GitHub or LinkedIn. Decepticon is open source at github.com/PurpleAILAB/Decepticon — worth exploring even if you’re not on the red team side, just to understand what a sophisticated autonomous agent architecture looks like.

Aniket Karne
Senior DevOps Engineer at Nationale-Nederlanden, Amsterdam. Building with AI agents, Kubernetes, and cloud infrastructure. Writing about what's actually being built.


Written by Aniket Karne

April 28, 2026 at 12:00 AM UTC