Skip to main content
Skills

Claude Council: make several AIs deliberate

Karpathy's LLM Council pattern adapted to Claude: make several AIs deliberate, anonymized cross-review, a Chairman's synthesis, and a prompt generator.

  • Guide
  • Architecture
  • Productivity
Published

What is a Council?

When you ask a hard question to a single model, you get a single angle. If that model has a bias on the topic, you inherit the bias without knowing it. The Council idea is simple: gather several opinions, confront them, then decide.

The pattern comes from Andrej Karpathy, who published the karpathy/llm-council repo on November 22, 2025. He describes it himself as code that was "99% vibe coded as a fun Saturday hack": a local web app he built in an afternoon to read books while comparing several LLMs side by side. The repo went viral (as of 2026-06-02, roughly 19,900 stars and 3,760 forks, figures worth re-checking on GitHub as they move fast), but it remains a personal prototype.

The workflow runs in three stages. The names and behavior below are taken directly from the repo's README.

1

First opinions

Your question goes to each model in the council, individually. Each one answers without seeing the others. You collect as many raw answers as there are advisors, shown side by side. That alone is useful: seeing several independent answers quickly reveals where they agree and where they clash.

2

Review (anonymized)

Each model then receives the others' answers, but the identities are hidden. Nobody knows who wrote what. That anonymity is deliberate: it stops a model from favoring itself or a brand. Each advisor then ranks the answers on two axes, accuracy and insight.

3

Final response

A designated model, the Chairman, receives every answer and every ranking, then writes a single synthesis. That final answer is what you see. The Chairman doesn't just copy the best opinion: it arbitrates the disagreements and combines what holds up.

A detail that matters for a "Claude" site: in the original version, Claude is just one advisor out of four, and Gemini is the one who presides. Here is the default configuration, copied verbatim from backend/config.py:

COUNCIL_MODELS = [
"openai/gpt-5.1",
"google/gemini-3-pro-preview",
"anthropic/claude-sonnet-4.5",
"x-ai/grok-4",
]
CHAIRMAN_MODEL = "google/gemini-3-pro-preview"

The identifiers are in OpenRouter format, because the repo routes every call through OpenRouter (ouvre un nouvel onglet) (one API key, several providers). On the stack side, it's FastAPI (Python 3.10+) with httpx for the backend, React + Vite for the front, and JSON files for storage. Nothing hosted: it's a local app you run on your own machine.

Multi-model or multi-persona?

The word "Council" covers two very different setups. Confusing them is the classic mistake, so let's put them side by side.

ApproachPrincipleStrengthsLimits
Multi-model (Karpathy style)N models from different providers (GPT, Gemini, Claude, Grok) deliberateGenuine diversity of angles, decorrelated biases across modelsCost and latency (N providers), OpenRouter key, uneven quality from one model to the next
Multi-persona of a single ClaudeOne Claude plays several roles that contradict each otherSimple (one provider, one key), reproducible, packageable as a skillCorrelated biases (same underlying model), risk of echo rather than real contradiction

On Claude, you most often go multi-persona: you ask the same model to play a skeptic, an optimist, a domain expert in turn, then act as the Chairman. It's simpler to set up and fits in a single prompt or a single skill. In exchange, you lose true independence of opinions: since it's the same brain behind every persona, the blind spots are shared. A well-written "contrarian" persona helps, but it doesn't replace a model trained differently.

Multi-model offers more authentic diversity, at the price of a stack to build and a bill that multiplies. Choose based on what's at stake.

Anatomy of a council

A council, whether multi-model or multi-persona, rests on three ingredients.

The advisors. In multi-model, these are distinct models. In multi-persona, they are roles you define. As an illustration, you might picture a panel like a skeptic who hunts for flaws, an advocate who defends the option, a practitioner who thinks about real execution, and a generalist who zooms out. These are only examples: the right personas depend entirely on your question. Three to five almost always suffice.

The anonymized cross-review. This is the step that separates a real council from simply "asking the question three times." Each opinion is submitted to the others with no author label, and each one must judge it on the merits. Anonymity prevents complacency.

The Chairman. It has the last word. Its job is not to vote but to synthesize: spot the agreements, surface the disagreements, and produce a reasoned recommendation. In multi-persona, it's often the same Claude switching hats one final time.

Demo: the generator

Rather than copying FastAPI code, here is a generator that produces the text you need. Pick the mode, the number of advisors, their personas and the Chairman: you get a prompt ready to paste into Claude, and a SKILL.md skeleton if you want to turn it into a reusable skill. Everything runs in your browser, nothing is sent anywhere.

Council generator

Pick the mode, the advisors and the Chairman. Everything is generated in your browser, no request is ever sent.

Mode
Number of advisors
3≈ 7 calls per question
Advisors
You will play a council of several experts deliberating on my question. Take on each role below in turn, then synthesize the answer as the Chairman.

## The advisors
1. Skeptic : hunts for flaws, risks and unverified assumptions.
2. Advocate : defends the most promising option and shows its value.
3. Practitioner : thinks about real execution, cost and constraints.

## Stage 1: First opinions
Each advisor answers the question independently, without seeing the other answers.

## Stage 2: Review (anonymized)
Show each advisor the other answers without revealing who wrote them, and ask them to rank the answers on accuracy and insight.

## Stage 3: Final response
The designated Chairman is: Skeptic.
The Chairman reads every opinion and every ranking, arbitrates the disagreements and produces a single, reasoned recommendation.

## Question
(replace with your question)

The generated prompt is fine for one-off use: you paste it, Claude plays the roles in a single conversation. The SKILL.md is for when you want to trigger this behavior repeatedly, without re-pasting the prompt every time. To understand how Claude loads a skill, see What is a skill?.

When to use it, when it's a waste

A council is expensive. Expect on the order of 2N + 1 calls per question (N first opinions, N cross-reviews, 1 Chairman synthesis). For four advisors, that's nine calls where a simple question takes one. Latency follows the same curve.

That overhead is only worth it for a real stake, with no obvious right answer, where several angles genuinely shed light on the decision.

Next steps

  • What is a skill? to understand how to package this pattern and trigger it automatically.
  • find-skills to check whether someone already published a council skill before writing your own.
  • Orchestrating agents if you want to go beyond a single prompt and run the advisors as separate agents.