Cognitive Modes: The Prompt Pattern Beyond 'System Prompts' — aniketkarneai.com

When you’re building a multi-agent system, the instinct is to reach for a system prompt: “You are a helpful architect agent.” And for a while, that works. But as the system scales, you hit a wall — agents stop producing the kind of output their role demands. The Architect agent gives you hand-wavy suggestions instead of paranoid reviews. The QA agent approves things it shouldn’t.

The ACO system took a different approach. Instead of system prompts, it uses cognitive modes — distinct thinking frameworks baked into each agent’s identity that produce measurably different outputs. After seeing this pattern in action across five agents, I think it’s worth documenting properly.

What a Cognitive Mode Actually Is

A cognitive mode is not a personality tweak. It’s not adding “you are thorough and detail-oriented” to a prompt. It’s a fundamentally different processing framework that changes what the agent pays attention to, what it produces, and what it rejects.

Look at the Planner agent’s eng_manager mode, which the frontmatter declares as mode: eng_review. The prompt opens with:

Your output is machine-readable and execution-ready. Every task that leaves your hands will be implemented exactly as specified — no further interpretation needed.

Before writing a single task, the Planner with eng_manager mode is required to lock the technical spine:

Architecture diagram (ASCII, mandatory)
Data flow diagram with nil/empty/invalid/timeout cases
State machine with all triggers
Edge case map with detection → prevention → recovery
Risk table with probability/impact/mitigation

This isn’t guidance. It’s a mandatory processing ritual that produces a completely different kind of task spec than what you’d get from a generic “plan this feature” prompt.
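One way to make that ritual checkable is to scan the Planner's output for the five spine artifacts before any task is accepted. This is a minimal sketch, not ACO's implementation: the function name and the idea of matching on section titles are my assumptions, since the prompt names the artifacts but not how they are labeled in the output.

```python
# Hypothetical section markers: the five artifacts come from the Planner
# prompt, but matching them by title text is an assumption of this sketch.
REQUIRED_SPINE_SECTIONS = (
    "Architecture diagram",
    "Data flow diagram",
    "State machine",
    "Edge case map",
    "Risk table",
)


def missing_spine_sections(planner_output: str) -> list[str]:
    """Return the spine sections that do not appear in the Planner output."""
    text = planner_output.lower()
    return [name for name in REQUIRED_SPINE_SECTIONS if name.lower() not in text]


# An incomplete draft: only two of the five artifacts are present.
draft = """
## Architecture diagram
[client] -> [api] -> [db]

## Risk table
| risk | probability | impact | mitigation |
"""

print(missing_spine_sections(draft))
# ['Data flow diagram', 'State machine', 'Edge case map']
```

A title match only proves the section exists, not that the diagram inside it is any good, so treat this as a gate rather than a review.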

The 9-Field Task Contract

The latest evolution is the 9-field task contract — every task output from the Planner must carry:

title, description — obvious ones
file_path, function_signature — where and how
dependencies — what must come first
acceptance_criteria — what “done” actually means
test_strategy — how you’ll know it works
technical_notes — edge constraints
estimate_hours — for sprint planning

The critical insight: fields like function_signature and acceptance_criteria are not optional. The prompt explicitly states tasks missing these fields are INVALID and will be rejected. This is contract enforcement at the prompt level, not in downstream code.
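ACO enforces the contract in the prompt itself, but the same nine fields are easy to mirror in a test harness. The sketch below is one possible shape, under assumptions: the field names come from the list above, while the types, the `contract_violations` helper, and the sample task are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class TaskContract:
    # Field names follow the 9-field contract; the types are assumptions.
    title: str
    description: str
    file_path: str
    function_signature: str
    dependencies: list[str]
    acceptance_criteria: list[str]
    test_strategy: str
    technical_notes: str
    estimate_hours: float


def contract_violations(raw: dict) -> list[str]:
    """List contract fields that are absent or empty in a raw task dict."""
    problems = []
    for f in fields(TaskContract):
        if f.name not in raw:
            problems.append(f"missing: {f.name}")
        # An empty dependency list is legitimate: some tasks come first.
        elif f.name != "dependencies" and raw[f.name] in ("", [], None):
            problems.append(f"empty: {f.name}")
    return problems


# Hypothetical Planner output with no function_signature and no criteria.
task = {
    "title": "Add retry to webhook sender",
    "description": "Retry failed deliveries with backoff.",
    "file_path": "services/webhooks/sender.py",
    "dependencies": [],
    "acceptance_criteria": [],
    "test_strategy": "Unit test with a fake HTTP client.",
    "technical_notes": "Sender must stay idempotent.",
    "estimate_hours": 3,
}

print(contract_violations(task))
# ['missing: function_signature', 'empty: acceptance_criteria']
```

A check like this does not replace the prompt-level rule. It only tells you when the model ignored it.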

What Role-Based Modes Produce

Here are the five cognitive modes running in ACO today:

Agent	Mode	Produces
PM	`ceo_founder`	Strategic challenges, 10-star product thinking, premise interrogation
Planner	`eng_review`	Architecture diagrams, data flows, state machines, risk tables
Architect	`paranoid_review`	N+1 queries, race conditions, trust boundary violations
Dev	`release_engineer`	Ship-fast workflow: sync → test → push → PR
QA	`browse_qa`	60-second smoke tests, screenshots, UI verification

Each mode produces outputs the other agents can’t. A paranoid_review Architect will find things a generic Architect won’t — because the mode tells it to specifically hunt for trust boundary violations and N+1 query patterns, not just “review the code.”
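Because the mode lives in each prompt's frontmatter, a small consistency check can catch an agent that has drifted to the wrong mode. This sketch assumes YAML-style frontmatter delimited by `---` lines, which the source does not confirm; `declared_mode` and `mode_matches` are hypothetical helpers, and only the agent-to-mode mapping is taken from the table above.

```python
from typing import Optional

# Agent -> mode mapping, copied from the table above.
EXPECTED_MODES = {
    "PM": "ceo_founder",
    "Planner": "eng_review",
    "Architect": "paranoid_review",
    "Dev": "release_engineer",
    "QA": "browse_qa",
}


def declared_mode(prompt_text: str) -> Optional[str]:
    """Read the `mode:` key from a '---' delimited frontmatter block, if any."""
    lines = prompt_text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        return None
    for line in lines[1:]:
        if line.strip() == "---":  # end of frontmatter
            break
        key, sep, value = line.partition(":")
        if sep and key.strip() == "mode":
            return value.strip()
    return None


def mode_matches(agent: str, prompt_text: str) -> bool:
    """True only if the agent is known and its prompt declares the expected mode."""
    expected = EXPECTED_MODES.get(agent)
    return expected is not None and declared_mode(prompt_text) == expected


planner_prompt = (
    "---\n"
    "mode: eng_review\n"
    "---\n"
    "Your output is machine-readable and execution-ready.\n"
)

print(mode_matches("Planner", planner_prompt))    # True
print(mode_matches("Architect", planner_prompt))  # False
```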

Why This Matters More Than Prompt Engineering

Most prompt engineering advice focuses on instruction quality: “be more specific,” “add examples,” “use XML tags.” Cognitive modes go a level deeper. They change what the agent automatically considers before producing output.

A generic Architect prompt might say “think about security and performance.” The paranoid_review mode says: here are the specific categories of production bugs to hunt — N+1 queries, race conditions, trust boundaries, timeout handling — and here is the format to report them.

The mode doesn’t just instruct. It activates a lens.
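To make the contrast with "think about security and performance" concrete, here is a rough sketch of how a paranoid_review-style prompt could be assembled. The four hunt categories are the ones named above. The report fields, the wording, and `build_review_prompt` are my own placeholders, not the actual ACO prompt.

```python
# Categories named in the paranoid_review description above.
HUNT_CATEGORIES = (
    "N+1 queries",
    "race conditions",
    "trust boundaries",
    "timeout handling",
)

# Hypothetical report format: the source says a format exists but not what it is.
REPORT_FIELDS = ("category", "location", "evidence", "suggested_fix")


def build_review_prompt(diff: str) -> str:
    """Assemble a review prompt that names what to hunt and how to report it."""
    hunts = "\n".join(f"- {category}" for category in HUNT_CATEGORIES)
    report = " | ".join(REPORT_FIELDS)
    return (
        "Mode: paranoid_review\n"
        "Hunt for these categories of production bugs:\n"
        f"{hunts}\n"
        f"Report every finding as: {report}\n"
        "Code under review:\n"
        f"{diff}"
    )


print(build_review_prompt("for order in orders: load_customer(order.customer_id)"))
```

The point of the structure is that both the checklist and the output shape are fixed before the model sees any code.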

The Confidence Problem (and Why Integration Tests Matter)

The March 13 commit that introduced these modes noted 85% confidence — “prompts are sound, needs real LLM testing.” That honesty is important. Cognitive modes look correct in review but may produce unexpected behaviors under real LLM inference.

The fix was integration tests: 5/5 passing, each checking the actual agent output rather than just file existence. This is a lesson worth generalizing: when you’re changing how agents think, you need behavioral tests that verify the mode actually produces the expected outputs.
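A behavioral test for a mode can be as simple as feeding the agent a snippet with a known flaw and checking that the review names it. The sketch below is illustrative only: `check_architect_mode` is a hypothetical harness, and the two fake agents stand in for real LLM calls so the example runs offline.

```python
from typing import Callable

# Code with a deliberate N+1 pattern: one query per order inside the loop.
# It is passed to the agent as text and never executed here.
SNIPPET = (
    "for order in orders:\n"
    "    customer = db.query(Customer).get(order.customer_id)\n"
)


def check_architect_mode(run_agent: Callable[[str], str]) -> list[str]:
    """Return the expected finding terms that the agent's review failed to mention."""
    review = run_agent(SNIPPET).lower()
    expected_terms = ["n+1"]
    return [term for term in expected_terms if term not in review]


# Stand-ins for real model calls (assumptions for the sake of a runnable example).
def fake_generic_agent(code: str) -> str:
    return "Looks fine, consider adding comments."


def fake_paranoid_agent(code: str) -> str:
    return "N+1 query: one Customer lookup per order inside the loop."


print(check_architect_mode(fake_generic_agent))   # ['n+1']
print(check_architect_mode(fake_paranoid_agent))  # []
```

With a real model behind `run_agent`, a keyword match is a crude signal, so expect to run each case several times and tolerate some variance.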

The bigger lesson: cognitive modes are an architectural pattern, not a prompt tweak. They deserve the same rigor — testing, versioning, rollback plans — that you’d apply to any structural change in a production system.

If you’re building multi-agent systems and your agents are producing generic outputs, the problem isn’t your instructions. It’s that your agents need better cognitive modes.

Aniket Karne

DevOps & AI Engineer · Amsterdam
