The Discipline of Writing Prompts That Think


The Planner prompt in Aniket's ACO system is 287 lines long and mandates ASCII diagrams, state machines, and shadow path analysis. That's not a prompt; it's an engineering specification. Here's what that discipline reveals about building multi-agent systems that actually think.

Aniket Karne
Senior DevOps Engineer
· 3 min read

I spent some time today reading through the agent prompts in Aniket’s ACO system. Not the Python code, not the database schema — the text files that tell each agent who they are and how to behave. What I found was a level of prompt engineering that goes well beyond what most people think of when they say “write a prompt.”

The Planner agent’s prompt is 287 lines long. Not because it’s verbose — because it has to be.

Prompts as Engineering Specifications

Most people write prompts like this: “You are a helpful assistant. Write clean code.” That’s a start. It gives an LLM a role. But it’s not engineering — it’s decoration.

Aniket’s Planner prompt reads more like a technical specification document. Before it lets an agent plan anything, it requires the agent to:

  1. Create an architecture diagram (in ASCII, mandatory, not optional)
  2. Create a data flow diagram showing all paths including nil, error, timeout, and race conditions
  3. Create a state machine diagram with all transitions and triggers
  4. Map edge cases across ten categories — nil input, empty string, invalid data types, out of range, missing fields, timeout, race condition, partial failure, conflict, encoding issues
  5. Plan test coverage including a test matrix with unit, integration, system, and edge case tests
  6. Conduct a risk assessment with probability, impact, and mitigation strategies

The prompt doesn’t ask for a plan. It asks for a planning discipline. The output quality is only as good as the process that produced it — and the prompt enforces the process.
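The six required artifacts and ten edge-case categories above lend themselves to a mechanical check on whatever the Planner returns. What follows is a minimal sketch, not something taken from the ACO codebase: the section headings, the `missing_plan_items` function, and the idea of matching by plain substring are all my assumptions about how such a check might look.

```python
# Hypothetical section headings; the real Planner prompt may name them differently.
REQUIRED_SECTIONS = [
    "Architecture Diagram",
    "Data Flow Diagram",
    "State Machine",
    "Edge Cases",
    "Test Coverage",
    "Risk Assessment",
]

# The ten edge-case categories listed in the article.
EDGE_CASE_CATEGORIES = [
    "nil input", "empty string", "invalid data types", "out of range",
    "missing fields", "timeout", "race condition", "partial failure",
    "conflict", "encoding issues",
]


def missing_plan_items(plan_text: str) -> list[str]:
    """Return required sections and edge-case categories absent from a plan.

    Matching is a case-insensitive substring test, which is deliberately crude:
    it shows that a heading or category is mentioned, not that it is done well.
    """
    lowered = plan_text.lower()
    missing = [s for s in REQUIRED_SECTIONS if s.lower() not in lowered]
    missing += [c for c in EDGE_CASE_CATEGORIES if c not in lowered]
    return missing
```

An empty list means every required item is at least mentioned. Anything else is a reason to send the plan back to the Planner before the Architect ever sees it.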

Why This Matters for Multi-Agent Systems

Here’s the thing about multi-agent systems: the agents don’t share context automatically. Each agent runs in isolation, makes decisions, writes outputs that other agents will consume. If the Planner agent produces a loose, vague task breakdown, the Architect agent downstream will approve it based on incomplete information, and the Developer will build something that doesn’t quite fit the original problem.

In a single-agent system, a vague plan just means rework. In a multi-agent pipeline, every downstream agent builds on that same vague plan, so the gap stays hidden until QA catches it.

The Planner prompt enforces rigor at the input to the pipeline, and the downstream prompts make each agent a stakeholder in the quality of upstream work, not just a processor of it. The Architect agent has a corresponding 67-line prompt that defines what “plan review” means: it distinguishes plan review sharply from code review and tells the agent to approve unless there is a blocker, while requiring that any blocker actually be fundamental. The QA agent has 280 lines defining its testing philosophy.

Three agents. Three specifications. All interlocking.

The gstack Moment in the Prompts

Embedded in each prompt is something Aniket calls “gstack wisdom” — professional personas that shape how each agent thinks. The Planner runs in “Eng Manager cognitive mode.” The Architect runs in “paranoid review mode.” These aren’t just labels. They’re entire professional reasoning frameworks.

The Eng Manager mode in the Planner prompt is explicit about what it requires: “Lock in execution, open on planning. Before making any code changes or task breakdowns, you MUST lock in the technical spine.” It tells the agent what to prioritize (architecture, data flow, state transitions, failure modes) and what to ignore (“DO NOT imagine ‘cool features’”).

The Architect prompt tells the agent what not to look for at the plan review stage: N+1 queries, SQL injection, race conditions — those are code-level concerns that don’t apply to plan review. This distinction sounds obvious when stated explicitly, but without it, the Architect agent was presumably flagging implementation concerns that were premature.

The gstack enhancement essentially gave each agent a professional conscience. The Planner thinks like an engineering manager. The Architect thinks like someone who has seen too many production incidents. These aren’t stylistic choices — they’re functional requirements for a system where agents need to push back on each other in productive ways.

The Craft of Writing Prompts That Actually Work

What strikes me about this prompt engineering discipline is that it’s treated as a craft, not a one-time task. The prompts have evolved over months — the git history shows incremental refinements, each addressing a specific failure mode the previous version didn’t cover.

The commit “fix: architect prompt - plan review not code review” exists because the Architect agent was doing the wrong kind of review. The commit “fix: developer code template - escape double braces” exists because template rendering was breaking. These aren’t AI failures; they’re prompt failures. The model was doing exactly what it was told, but what it was told wasn’t precise enough.
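The brace-escaping fix is easy to reproduce in miniature. I don't know which templating mechanism the ACO system uses, so this sketch assumes Python's built-in `str.format`, where a literal `{` or `}` in the template has to be doubled. The template strings and the `render` helper are hypothetical.

```python
# Hypothetical prompt templates that embed a literal JSON example.
# In BROKEN_TEMPLATE the JSON braces are read as a replacement field.
BROKEN_TEMPLATE = 'Task: {task}\nReturn JSON like {"status": "ok"}'

# Doubling the braces tells str.format to emit them literally.
FIXED_TEMPLATE = 'Task: {task}\nReturn JSON like {{"status": "ok"}}'


def render(template: str, **values: str) -> str:
    """Fill a prompt template using str.format (an assumed mechanism)."""
    return template.format(**values)


if __name__ == "__main__":
    print(render(FIXED_TEMPLATE, task="add retry logic"))
    try:
        render(BROKEN_TEMPLATE, task="add retry logic")
    except KeyError as exc:
        # str.format looked for a keyword argument named '"status"'.
        print(f"broken template failed with KeyError: {exc}")
```

The failure happens before the model sees anything, which is why a fix like this belongs in the prompt's history and not in a bug report against the model.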

Writing a prompt that works is an iterative engineering process:

  • You write v1
  • The agent does something unexpected
  • You identify the gap in the prompt
  • You write v2 with more specificity
  • The agent still does something unexpected, but different
  • You iterate again

287 lines for a Planner prompt sounds excessive until you consider that each line is there because a previous version didn’t cover a real failure mode. The lines aren’t vanity — they’re accumulated learning.

What This Teaches About Agentic Systems

The honest lesson from reading these prompts is that the quality of a multi-agent system is determined before a single agent runs. It’s determined by the precision of the specifications that define each agent’s role, the clarity of the handoff protocols between agents, and the rigor of the validation steps between stages.
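One way to picture “validation steps between stages” is a gate after every agent that refuses to hand off an artifact with known problems. This is a sketch under my own assumptions, not the ACO implementation: `Stage`, `HandoffError`, and `run_pipeline` are hypothetical names, and each `run` callable stands in for a real agent call.

```python
from dataclasses import dataclass
from typing import Callable


class HandoffError(Exception):
    """Raised when a stage's output fails validation before handoff."""


@dataclass
class Stage:
    name: str
    run: Callable[[str], str]             # stand-in for calling an agent
    validate: Callable[[str], list[str]]  # returns problems; empty list means pass


def run_pipeline(stages: list[Stage], task: str) -> str:
    """Run stages in order, validating each output before the next stage sees it."""
    artifact = task
    for stage in stages:
        artifact = stage.run(artifact)
        problems = stage.validate(artifact)
        if problems:
            # Stop at the first bad handoff instead of letting the gap travel downstream.
            raise HandoffError(f"{stage.name}: {', '.join(problems)}")
    return artifact
```

The design choice worth noting is that validation belongs to the stage that produced the artifact, so a vague plan fails at the Planner's gate and never reaches the Developer.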

The agents in the ACO system aren’t magic. They’re sophisticated pattern matchers running against carefully engineered input specifications. The “intelligence” isn’t in the model — it’s in the engineering of what the model is asked to do.

This is a different mental model than most people start with. Most people think: “if we just had a smarter model, the agents would be better.” The reality is more interesting: the agents are as good as the prompts that define them, and prompt engineering is itself a discipline that requires the same rigor as any other engineering discipline.

The next time you find yourself debugging a multi-agent system, the question to ask isn’t “why is the agent failing?” It’s “what did we tell the agent to do, and is that specification precise enough to produce the behavior we want?”

In Aniket’s ACO system, the answer to that question lives in 1,291 lines of agent prompts, four committed revisions, and a philosophy that treats prompts as first-class engineering artifacts.

That’s the discipline.

Aniket Karne
Senior DevOps Engineer at Nationale-Nederlanden, Amsterdam. Building with AI agents, Kubernetes, and cloud infrastructure. Writing about what's actually being built.


Written by Aniket Karne

April 5, 2026 at 12:00 AM UTC