Two Repos, Two Philosophies: How AI Trading Bots Are Actually Built
A side-by-side code analysis of ai-hedge-fund and TradingAgents — fan-out vs sequential pipeline, celebrity personas vs analyst/researcher/trader structure, and what both approaches reveal about multi-agent system design.
There’s a pattern emerging in open-source AI: instead of building one smart agent, developers create many specialized agents and let them argue. The most starred examples are in finance — where the organizational structure of a real trading desk has been directly mapped into software.
I cloned two of the most popular multi-agent trading frameworks and spent a week reading the code. They take completely different approaches to the same problem, and the differences reveal a lot about how multi-agent systems actually work — and where they break down.
The two repos:
- virattt/ai-hedge-fund — 56K stars. Solo developer. Star investor personas.
- TauricResearch/TradingAgents — 52K stars. Research group. Analyst/researcher/trader structure.
Repo 1: AI Hedge Fund — The Celebrity Portfolio
Philosophy: “What would Buffett do?”
The ai-hedge-fund approach is immediately recognizable: 19 agents, each named after a famous investor. The system prompt for the Warren Buffett agent literally says “You are Warren Buffett.” Same for Taleb, Burry, Cathie Wood, and the rest.
Architecture:
User input (tickers) → LangGraph workflow
→ 19 investor agents (run in parallel)
→ Risk Manager agent
→ Portfolio Manager agent
→ Final decision
What it’s really doing:
Each investor agent follows a template — fetch financial data, run quantitative analysis, hand off to the LLM with a persona prompt:
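# Simplified from warren_buffett.py
def warren_buffett_agent(state: AgentState, agent_id: str):
    # Tickers and the analysis date come from the shared state (layout assumed here)
    data = state["data"]
    tickers, end_date = data["tickers"], data["end_date"]
    for ticker in tickers:
        metrics = get_financial_metrics(ticker, end_date)
        line_items = search_line_items(ticker, [...], end_date)
        market_cap = get_market_cap(ticker)
        # Quantitative analysis
        fundamental_score = analyze_fundamentals(metrics)
        moat_score = analyze_moat(metrics)
        intrinsic_value = calculate_intrinsic_value(line_items)
        # Bundle the numbers, then make the LLM call with the investor persona
        analysis_data = {
            "fundamentals": fundamental_score,
            "moat": moat_score,
            "intrinsic_value": intrinsic_value,
            "market_cap": market_cap,
        }
        output = call_llm(prompt=f"You are Warren Buffett. {analysis_data}")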
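The persona is a thin wrapper around genuine financial analysis. ROE thresholds, margin of safety calculations, DCF models: these are real quantitative tools, and they run as ordinary Python before the model is ever called. The LLM's job is to interpret the resulting scores in the persona's voice and turn them into a signal.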
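To make "real quantitative tools" concrete, here is a minimal sketch of the kind of arithmetic that sits under the persona. It is not the repo's code: the function names, the 15% ROE cutoff, and the Gordon-growth terminal value are my own assumptions, chosen only to illustrate a DCF feeding a margin-of-safety check.

```python
def discounted_cash_flow(fcf: float, growth: float, discount: float,
                         years: int, terminal_growth: float) -> float:
    """Present value of projected free cash flow plus a terminal value.

    Hypothetical helper: grows `fcf` at `growth` for `years`, discounts each
    year back at `discount`, then adds a Gordon-growth terminal value.
    """
    if discount <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    value = 0.0
    cash = fcf
    for year in range(1, years + 1):
        cash *= 1 + growth
        value += cash / (1 + discount) ** year
    terminal = cash * (1 + terminal_growth) / (discount - terminal_growth)
    return value + terminal / (1 + discount) ** years


def margin_of_safety(intrinsic_value: float, market_cap: float) -> float:
    """Positive when the business is worth more than the market is charging."""
    return (intrinsic_value - market_cap) / market_cap


def passes_roe_threshold(roe: float, threshold: float = 0.15) -> bool:
    """Assumed cutoff: treat return on equity above 15% as a quality signal."""
    return roe > threshold


if __name__ == "__main__":
    value = discounted_cash_flow(fcf=110.0, growth=0.0, discount=0.10,
                                 years=1, terminal_growth=0.0)
    print(value, margin_of_safety(value, market_cap=1000.0))
```

The point is that none of this needs an LLM. The model only sees the outputs, which is why the quality of the final signal depends more on these functions and their inputs than on the persona prompt.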
How agents communicate:
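LangGraph’s AgentState carries a shared dictionary forward. Every agent adds its signals to state["data"]["analyst_signals"]. Because the state is declared with a merge operator, each agent's update is merged into the existing dictionary instead of replacing it, so signals accumulate. The Portfolio Manager reads all of them at the end.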
Key insight: This is a fan-out pattern. All investor agents run in parallel, each independently. No agent knows what the others said until the Portfolio Manager aggregates.
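The fan-out-then-aggregate shape is easy to reproduce without LangGraph. The sketch below is a stdlib-only stand-in, not the repo's implementation: the stub agents, merge_signals, and the confidence-weighted portfolio_manager are all hypothetical names I made up to show the pattern.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable

Agent = Callable[[str], dict]


def merge_signals(left: dict, right: dict) -> dict:
    """Reducer: combine two partial signal dicts without dropping either side."""
    return {**left, **right}


def make_agent(agent_id: str, signal: str, confidence: int) -> Agent:
    """Build a stub agent. A real one would fetch data and call an LLM."""
    def agent(ticker: str) -> dict:
        return {agent_id: {"ticker": ticker, "signal": signal,
                           "confidence": confidence}}
    return agent


def fan_out(agents: list, ticker: str) -> dict:
    """Run every agent in parallel; no agent sees another agent's output."""
    signals: dict = {}
    with ThreadPoolExecutor() as pool:
        for partial in pool.map(lambda agent: agent(ticker), agents):
            signals = merge_signals(signals, partial)
    return signals


def portfolio_manager(signals: dict) -> str:
    """Aggregate step: net the confidence of bullish against bearish signals."""
    net = 0
    for entry in signals.values():
        if entry["signal"] == "bullish":
            net += entry["confidence"]
        elif entry["signal"] == "bearish":
            net -= entry["confidence"]
    if net > 0:
        return "buy"
    if net < 0:
        return "sell"
    return "hold"


if __name__ == "__main__":
    agents = [make_agent("buffett", "bullish", 80),
              make_agent("burry", "bearish", 60),
              make_agent("wood", "bullish", 30)]
    print(portfolio_manager(fan_out(agents, "NVDA")))
```

Each agent writes under its own key, so the merge never has to resolve a conflict. That is what makes the parallel step safe, and also why the agents cannot react to each other.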
Repo 2: TradingAgents — The Trading Desk
Philosophy: “What would a well-structured firm do?”
TradingAgents takes a different approach. Instead of celebrity personas, it models the org chart of an actual quant trading firm:
Analyst Team (parallel)
→ Fundamentals Analyst
→ Sentiment Analyst
→ News Analyst
→ Technical Analyst
Researcher Team (debate)
→ Bullish Researcher
→ Bearish Researcher
→ (structured debate rounds)
Trader Agent
→ Composes analyst/researcher outputs
Risk Management + Portfolio Manager
→ Final execution decision
What’s different:
- Debate rounds: The Bullish and Bearish researchers explicitly argue against each other. This is a multi-turn conversation where one agent challenges the other’s thesis. The number of debate rounds is configurable (max_debate_rounds).
- Risk management has veto power: Unlike ai-hedge-fund where the Portfolio Manager has the final word, TradingAgents has a dedicated Risk Management team that evaluates the proposed trade against current portfolio risk metrics. The Portfolio Manager can be overruled.
- Memory/reflection: TradingAgents has a reflect_and_remember method that stores past decisions and their outcomes:
ta = TradingAgentsGraph(debug=True, config=config)
_, decision = ta.propagate("NVDA", "2024-05-10")
ta.reflect_and_remember(1000) # parameter = position returns
- Multi-provider LLM support: v0.2.x supports GPT-5.x, Gemini 3.x, Claude 4.x, Grok 4.x, DeepSeek, Qwen, and Azure OpenAI. Switch with a config change.
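The debate loop and the risk veto are the two structural pieces that the fan-out design lacks, so here is a rough sketch of both. This is my own stand-in, not TradingAgents code: only max_debate_rounds comes from the repo's config, while run_debate, Proposal, risk_review, and the position-size rule are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

# A researcher reads the transcript so far and returns its next argument.
Researcher = Callable[[list], str]


def run_debate(bull: Researcher, bear: Researcher,
               max_debate_rounds: int) -> list:
    """Alternate bull and bear turns; each side sees the full transcript."""
    transcript: list = []
    for _ in range(max_debate_rounds):
        transcript.append(bull(transcript))
        transcript.append(bear(transcript))
    return transcript


@dataclass
class Proposal:
    ticker: str
    action: str          # "buy", "sell", or "hold"
    position_pct: float  # share of the portfolio, 0.0 to 1.0


def risk_review(proposal: Proposal, max_position_pct: float) -> Proposal:
    """Veto step: an oversized trade is replaced by a hold, not passed on."""
    if proposal.action != "hold" and proposal.position_pct > max_position_pct:
        return Proposal(proposal.ticker, "hold", 0.0)
    return proposal


# Stub researchers; real ones would be LLM calls that quote the transcript.
def stub_bull(transcript: list) -> str:
    return f"bull turn {len(transcript) // 2 + 1}"


def stub_bear(transcript: list) -> str:
    return f"bear rebuts: {transcript[-1]}"


if __name__ == "__main__":
    for line in run_debate(stub_bull, stub_bear, max_debate_rounds=2):
        print(line)
    print(risk_review(Proposal("NVDA", "buy", 0.40), max_position_pct=0.25))
```

The key difference from the fan-out version is that the bear's input includes the bull's output. That dependency is what forces the pipeline to run sequentially, and it is also where the extra latency comes from.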
Side-by-Side Comparison
| Aspect | ai-hedge-fund | TradingAgents |
|---|---|---|
| Agents | 19 investor personas | 4 analyst + 2 researcher + trader + risk |
| Architecture | Fan-out, parallel | Sequential pipeline with debate |
| Debate | None | Bullish vs Bearish researchers |
| Risk veto | No | Yes, risk team can block trades |
| Memory | No | Yes, reflect_and_remember |
| Tech stack | LangGraph | LangGraph + custom |
| LLM providers | OpenAI, Anthropic, Groq, Ollama | GPT, Gemini, Claude, Grok, DeepSeek, Azure |
| Backtesting | Yes (engine in src/backtesting/) | Not in core |
| License | MIT | Apache 2.0 |
| Contributors | 1 (742 commits) | ~5 active |
The Common Thread
Despite their differences, both repos share the same fundamental assumption: the right way to make a trading decision is to model the organizational structure that humans use to make the same decision.
Trading firms have analysts, researchers, traders, and risk managers. These repos replicate that structure in software. The insight isn’t new — it’s how real asset management firms work. But mapping it to multi-agent software is a clean abstraction that makes the code readable and the behavior predictable.
Both repos also share the same failure mode: they confuse the appearance of analysis with actual analysis. Nineteen agents giving opinions sounds rigorous. A bullish/bearish debate sounds thorough. But if the underlying data is wrong, or the prompts don’t actually distinguish between a good and bad investment, every agent inherits the same flaw, and the multi-agent structure just adds latency and cost.
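A toy example shows why agreement is not evidence. In the sketch below, five hypothetical "personas" differ only in the ROE threshold they apply, and all of them read the same metrics dict. The thresholds and the unit mix-up are invented for illustration.

```python
from collections import Counter
from typing import Callable

Persona = Callable[[dict], str]


def make_persona(roe_threshold: float) -> Persona:
    """Each persona is just a different cutoff applied to the same input."""
    def persona(metrics: dict) -> str:
        return "bullish" if metrics["roe"] > roe_threshold else "bearish"
    return persona


def consensus(personas: list, metrics: dict) -> tuple:
    """Return the majority vote and how many personas backed it."""
    votes = Counter(persona(metrics) for persona in personas)
    return votes.most_common(1)[0]


personas = [make_persona(t) for t in (0.10, 0.12, 0.15, 0.18, 0.20)]

# Correct input: ROE of 8% expressed as a fraction.
clean = {"roe": 0.08}
# Bad input: the same 8% delivered as a percentage instead of a fraction.
corrupted = {"roe": 8.0}

if __name__ == "__main__":
    print(consensus(personas, clean))
    print(consensus(personas, corrupted))
```

Both runs end in a unanimous vote, in opposite directions. Adding more personas would raise the vote count without adding any check on the input, which is the sense in which more agents can mean more cost and no more rigor.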
Which Architecture Is Better?
Depends on what you’re optimizing for.
ai-hedge-fund wins on: simplicity, fun, personas that tell a story. Great for demos. Easy to understand. The celebrity angle makes the output entertaining.
TradingAgents wins on: robustness, reviewability, adaptability. The explicit debate and risk veto are features, not overhead. Memory gives it a mechanism for learning from past decisions and their outcomes.
For a production trading system, TradingAgents’ architecture is more defensible. The explicit debate structure is interesting for other domains too — code review, architecture decisions, editorial processes.
The insight from both repos: multi-agent systems work best when the organizational structure mirrors the real-world process you’re modeling. The mistake most people make is throwing agents at a problem without thinking about what each agent should specialize in and how decisions should aggregate.
Written by Aniket Karne
April 22, 2026 at 12:00 AM UTC