Two Repos, Two Philosophies: How AI Trading Bots Are Actually Built — aniketkarneai.com

There’s a pattern emerging in open-source AI: instead of building one smart agent, developers create many specialized agents and let them argue. The most starred examples are in finance — where the organizational structure of a real trading desk has been directly mapped into software.

I cloned two of the most popular multi-agent trading frameworks and spent a week reading the code. They take completely different approaches to the same problem, and the differences reveal a lot about how multi-agent systems actually work — and where they break down.

The two repos:

virattt/ai-hedge-fund — 56K stars. Solo developer. Star investor personas.
TauricResearch/TradingAgents — 52K stars. Research group. Analyst/researcher/trader structure.

Repo 1: AI Hedge Fund — The Celebrity Portfolio

Philosophy: “What would Buffett do?”

The ai-hedge-fund approach is immediately recognizable: 19 agents, each named after a famous investor. The system prompt for the Warren Buffett agent literally says “You are Warren Buffett.” Same for Taleb, Burry, Cathie Wood, and the rest.

Architecture:

User input (tickers) → LangGraph workflow
  → 19 investor agents (run in parallel)
  → Risk Manager agent
  → Portfolio Manager agent
  → Final decision

What it’s really doing:

Each investor agent follows a template — fetch financial data, run quantitative analysis, hand off to the LLM with a persona prompt:

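# Simplified from warren_buffett.py
def warren_buffett_agent(state: AgentState, agent_id: str):
    # Assumption: tickers and end_date travel in the shared state
    data = state["data"]
    tickers, end_date = data["tickers"], data["end_date"]

    for ticker in tickers:
        # Fetch the raw financial data
        metrics = get_financial_metrics(ticker, end_date)
        line_items = search_line_items(ticker, [...], end_date)
        market_cap = get_market_cap(ticker)

        # Quantitative analysis
        fundamental_score = analyze_fundamentals(metrics)
        moat_score = analyze_moat(metrics)
        intrinsic_value = calculate_intrinsic_value(line_items)

        # Bundle the results for the prompt
        analysis_data = {
            "fundamentals": fundamental_score,
            "moat": moat_score,
            "intrinsic_value": intrinsic_value,
            "market_cap": market_cap,
        }

        # LLM call with investor persona
        output = call_llm(prompt=f"You are Warren Buffett. {analysis_data}")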

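The persona is a thin wrapper around genuine financial analysis. ROE thresholds, margin of safety calculations, and DCF models are real quantitative tools, and they run before the model is ever called. The LLM’s job is to interpret the resulting numbers in the persona’s voice and turn them into a signal.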
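To make the quantitative layer concrete, here is a minimal sketch of a discounted cash flow estimate with a margin-of-safety check. The function names, default rates, and the two-stage formula are illustrative assumptions, not the repo's actual implementation.

```python
def intrinsic_value_dcf(fcf, growth, discount, years=10, terminal_growth=0.02):
    """Discount projected free cash flow and add a Gordon-growth terminal value.

    All parameters are hypothetical inputs chosen for illustration.
    """
    if discount <= terminal_growth:
        raise ValueError("discount rate must exceed terminal growth")
    value = 0.0
    cash = fcf
    for year in range(1, years + 1):
        cash *= 1 + growth                      # grow the cash flow one year
        value += cash / (1 + discount) ** year  # discount it back to today
    # Value of all cash flows after the projection window, discounted to today
    terminal = cash * (1 + terminal_growth) / (discount - terminal_growth)
    return value + terminal / (1 + discount) ** years


def margin_of_safety(intrinsic_value, market_cap):
    """Fraction by which the estimated value exceeds the market price."""
    if intrinsic_value <= 0:
        raise ValueError("intrinsic value must be positive")
    return (intrinsic_value - market_cap) / intrinsic_value


if __name__ == "__main__":
    # Made-up inputs, in any consistent currency unit
    value = intrinsic_value_dcf(fcf=100.0, growth=0.05, discount=0.10)
    print(f"intrinsic value: {value:.1f}")
    print(f"margin of safety: {margin_of_safety(value, 1500.0):.1%}")
```

Numbers like these are what get passed to the persona prompt. The model is not asked to do the arithmetic itself.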

How agents communicate:

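LangGraph’s AgentState carries a shared dictionary forward. Every agent adds its signals to state["data"]["analyst_signals"]. Because the state field is declared with a merge reducer, each agent’s update is merged into the existing dictionary instead of overwriting it, so signals accumulate. The Portfolio Manager reads all of them at the end.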

Key insight: This is a fan-out pattern. All investor agents run in parallel, each independently. No agent knows what the others said until the Portfolio Manager aggregates.
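The fan-out and merge can be sketched in plain Python, without LangGraph. This is a rough illustration under stated assumptions: the agents are stubs standing in for LLM calls, and `merge_dicts`, `make_agent`, `run_fan_out`, and the confidence-weighted vote are hypothetical names and rules chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def merge_dicts(left: dict, right: dict) -> dict:
    """Reducer: combine two partial updates instead of overwriting.

    The merge is shallow, which is enough here because every agent
    writes under its own top-level key.
    """
    return {**left, **right}


def make_agent(name: str, signal: str, confidence: float):
    """Build a stub agent; a real one would fetch data and call an LLM."""
    def agent(tickers: list[str]) -> dict:
        return {name: {t: {"signal": signal, "confidence": confidence} for t in tickers}}
    return agent


def run_fan_out(agents, tickers):
    """Run every agent independently, then fold the results into one dict."""
    analyst_signals: dict = {}
    with ThreadPoolExecutor() as pool:
        for update in pool.map(lambda agent: agent(tickers), agents):
            analyst_signals = merge_dicts(analyst_signals, update)
    return analyst_signals


def portfolio_manager(analyst_signals: dict, ticker: str) -> str:
    """Aggregate with a confidence-weighted vote (an illustrative rule)."""
    weight = {"bullish": 1, "neutral": 0, "bearish": -1}
    score = sum(
        weight[signals[ticker]["signal"]] * signals[ticker]["confidence"]
        for signals in analyst_signals.values()
    )
    if score > 0:
        return "buy"
    if score < 0:
        return "sell"
    return "hold"


agents = [
    make_agent("buffett", "bullish", 0.8),
    make_agent("burry", "bearish", 0.6),
    make_agent("wood", "bullish", 0.5),
]
signals = run_fan_out(agents, ["NVDA"])
decision = portfolio_manager(signals, "NVDA")
```

No agent receives another agent's output as input. The only place opinions meet is the final aggregation step.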

Repo 2: TradingAgents — The Trading Desk

Philosophy: “What would a well-structured firm do?”

TradingAgents takes a different approach. Instead of celebrity personas, it models the org chart of an actual quant trading firm:

Analyst Team (parallel)
  → Fundamentals Analyst
  → Sentiment Analyst  
  → News Analyst
  → Technical Analyst

Researcher Team (debate)
  → Bullish Researcher
  → Bearish Researcher
  → (structured debate rounds)

Trader Agent
  → Composes analyst/researcher outputs

Risk Management + Portfolio Manager
  → Final execution decision

What’s different:

Debate rounds: The Bullish and Bearish researchers explicitly argue against each other. This is a multi-turn conversation where one agent challenges the other’s thesis. The number of debate rounds is configurable (max_debate_rounds).
Risk management has veto power: Unlike ai-hedge-fund, where the Portfolio Manager has the final word, TradingAgents has a dedicated Risk Management team that evaluates the proposed trade against current portfolio risk metrics. The Portfolio Manager can be overruled.
Memory/reflection: TradingAgents has a reflect_and_remember method that stores past decisions and their outcomes:

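from tradingagents.graph.trading_graph import TradingAgentsGraph
from tradingagents.default_config import DEFAULT_CONFIG

config = DEFAULT_CONFIG.copy()  # start from the library defaults
ta = TradingAgentsGraph(debug=True, config=config)
_, decision = ta.propagate("NVDA", "2024-05-10")
ta.reflect_and_remember(1000)  # parameter = position returns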

Multi-provider LLM support: v0.2.x supports GPT-5.x, Gemini 3.x, Claude 4.x, Grok 4.x, DeepSeek, Qwen, and Azure OpenAI. Switch with a config change.
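The debate rounds and the risk veto described above can be sketched as two small functions. This is a simplified illustration, not the TradingAgents implementation: only `max_debate_rounds` comes from the repo's config, while `run_debate`, `risk_review`, the stub researchers, and the 10% position limit are assumptions made for the example.

```python
from typing import Callable

Researcher = Callable[[list[str]], str]


def run_debate(bull: Researcher, bear: Researcher, max_debate_rounds: int) -> list[str]:
    """Alternate bull and bear turns; each side sees the transcript so far."""
    transcript: list[str] = []
    for _ in range(max_debate_rounds):
        transcript.append("BULL: " + bull(transcript))
        transcript.append("BEAR: " + bear(transcript))
    return transcript


def risk_review(trade: dict, portfolio_value: float, max_position_pct: float = 0.10) -> dict:
    """Veto any trade whose notional value exceeds the position limit."""
    notional = trade["quantity"] * trade["price"]
    if notional > portfolio_value * max_position_pct:
        return {**trade, "approved": False, "reason": "position limit exceeded"}
    return {**trade, "approved": True, "reason": "within limits"}


# Stub researchers; real ones would be LLM calls conditioned on the transcript.
def bull(transcript: list[str]) -> str:
    return f"growth thesis, replying to {len(transcript)} prior turns"


def bear(transcript: list[str]) -> str:
    return f"valuation risk, replying to {len(transcript)} prior turns"


transcript = run_debate(bull, bear, max_debate_rounds=2)
proposal = {"ticker": "NVDA", "action": "buy", "quantity": 100, "price": 900.0}
verdict = risk_review(proposal, portfolio_value=500_000)
```

The key structural difference from a fan-out is that each researcher's input includes the other side's previous argument, and that the risk check sits after the trade proposal, so it can block the trade.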

Side-by-Side Comparison

Aspect	ai-hedge-fund	TradingAgents
Agents	19 investor personas	4 analyst + 2 researcher + trader + risk
Architecture	Fan-out, parallel	Sequential pipeline with debate
Debate	None	Bullish vs Bearish researchers
Risk veto	No	Yes, risk team can block trades
Memory	No	Yes, reflect_and_remember
Tech stack	LangGraph	LangGraph + custom
LLM providers	OpenAI, Anthropic, Groq, Ollama	GPT, Gemini, Claude, Grok, DeepSeek, Qwen, Azure OpenAI
Backtesting	Yes (engine in `src/backtesting/`)	Not in core
License	None	Apache 2.0
Contributors	1 (742 commits)	~5 active

The Common Thread

Despite their differences, both repos share the same fundamental assumption: the right way to make a trading decision is to model the organizational structure that humans use to make the same decision.

Trading firms have analysts, researchers, traders, and risk managers. These repos replicate that structure in software. The insight isn’t new — it’s how real asset management firms work. But mapping it to multi-agent software is a clean abstraction that makes the code readable and the behavior predictable.

Both repos also share the same failure mode: they confuse the appearance of analysis with actual analysis. 19 agents giving opinions sounds rigorous. A bullish/bearish debate sounds thorough. But if the underlying data is wrong, or the prompts don’t actually distinguish between a good and bad investment, the multi-agent structure just adds latency and cost.

Which Architecture Is Better?

Depends on what you’re optimizing for.

ai-hedge-fund wins on: simplicity, fun, personas that tell a story. Great for demos. Easy to understand. The celebrity angle makes the output entertaining.

TradingAgents wins on: robustness, reviewability, adaptability. The explicit debate and risk veto are features, not overhead. Memory gives it a way to learn from the outcomes of past decisions.

For a production trading system, TradingAgents’ architecture is more defensible. The explicit debate structure is interesting for other domains too — code review, architecture decisions, editorial processes.

The insight from both repos: multi-agent systems work best when the organizational structure mirrors the real-world process you’re modeling. The mistake most people make is throwing agents at a problem without thinking about what each agent should specialize in and how decisions should aggregate.

Aniket Karne

DevOps & AI Engineer · Amsterdam
