Consensus Tool - Multi-Model Consensus Through Structured Debate

Gather diverse perspectives from multiple AI models to reach informed decisions through structured debate

The consensus tool orchestrates multiple AI models with assigned stances to provide diverse perspectives on proposals, enabling structured decision-making through for/against/neutral analysis and synthesized recommendations.

Example Prompts

For/Against Debate:

Use bab consensus with flash taking a "for" stance and pro being "against" to evaluate whether
we should migrate from REST to GraphQL for our API

Multi-Model Technical Decision:

Get consensus from o3, flash, and pro on our new authentication architecture. Have o3 focus on
security implications, flash on implementation speed, and pro stay neutral for overall assessment

Architecture Evaluation:

Use consensus with gemini supporting microservices and grok opposing to debate whether we should
decompose our monolith, include the current architecture diagram

Technology Tradeoff:

I need consensus from three models on whether to adopt Bun over Node for our backend services.
Have one model argue for, one against, and one neutral

Plugin Models in Parallel:

Run consensus in parallel using copilot/gemini-2.5-pro (for) and opencode/minimax-m2.5 (against)
on our caching strategy. Enable high thinking mode on both and set parallel: true

How It Works

The consensus tool queries multiple AI models and synthesizes their perspectives:

Define the proposal: Describe the decision or technical question to evaluate
Assign stances: Each model takes a specific viewpoint (for, against, or neutral)
Consult models: Sequential (default) or parallel — sequential builds context as it goes; parallel queries all models concurrently
Synthesize results: All perspectives are combined into a comprehensive recommendation
Multi-step support: Complex evaluations can span multiple steps using step_number and total_steps

Model Routing

The consensus tool supports two model sources:

SDK models — identified by model ID or alias (e.g. gemini-2.5-pro, gpt-4o, claude-opus-4). These must be available via a configured provider key.

Plugin models — identified by pluginId/modelName format (e.g. copilot/gemini-2.5-pro, opencode/minimax-m2.5). These are routed through the installed delegate plugin for that CLI tool.

Key Features

Parallel execution: Set parallel: true to consult all models concurrently via Promise.allSettled — partial failures are recorded, not fatal
Per-model options: Each model entry can specify its own temperature and thinking_mode independently
Stance steering: Assign specific perspectives (for/against/neutral) to each model
Custom stance prompts: Provide specific instructions for how each model should approach the analysis
Sequential context building: Each model sees all previous model responses for richer debate (sequential mode)
Thinking mode control: Set extended reasoning depth per model — minimal, low, medium, high, max
Multi-step evaluation: Break complex decisions into numbered steps for thorough analysis
File context support: Include relevant source files for informed decision-making
Image support: Analyze architectural diagrams, UI mockups, or design documents
Conversation continuation: Build on previous consensus analysis with additional rounds via continuation_id

Tool Parameters

step (string, required): The proposal or decision to evaluate
findings (string, required): Independent findings or context to inform the debate
models (object[], required, min 2): Model configurations, each with:
- model (string): Model ID, alias, or pluginId/modelName (e.g. copilot/gemini-2.5-pro)
- stance (“for” “against” “neutral”, optional): Perspective to take
- stance_prompt (string, optional): Custom instructions for the model’s analysis approach
- temperature (number 0-1, optional): Per-model temperature override
- thinking_mode (“minimal” “low” “medium” “high” “max”, optional): Per-model reasoning depth
step_number (number, required): Current step number (>= 1)
total_steps (number, required): Total number of planned steps (>= 1)
next_step_required (boolean, required): Whether another step follows this one
parallel (boolean, optional): Query all models concurrently instead of sequentially (default: false)
continuation_id (string, optional): Continue a previous consensus discussion
current_model_index (number, optional, default 0): Which model to consult next in the sequence
model_responses (object[], optional): Prior model responses from earlier in the sequence, each with:
- model: Model identifier
- provider: Provider name
- response: The model’s response text
- stance: The stance the model took
temperature (number, optional): Global response creativity, 0-1 (overridden by per-model temperature)
relevant_files (string[], optional): Absolute paths to files for additional context
images (string[], optional): Absolute paths to images for visual context

Usage Examples

Architecture Decision:

"Get consensus from pro and o3 on whether to use microservices vs monolith for our e-commerce platform"

Technology Migration:

"Use consensus with flash supporting and pro opposing to evaluate migrating from MySQL to PostgreSQL"

Design Tradeoff:

"Have three models debate whether we should use server-side rendering or client-side rendering for our dashboard"

With File Context:

"Use consensus to evaluate our current caching strategy - include src/cache/ and have models debate whether to switch to Redis"

Best Practices

Use at least two models: The tool requires a minimum of two models for meaningful debate
Balance stances: Mix supportive and critical perspectives for thorough analysis
Provide detailed findings: The findings parameter gives models essential context for informed opinions
Use custom stance prompts: Guide models to focus on specific aspects (security, performance, cost)
Include relevant files: Provide code or documentation so models can ground their analysis in reality
Use multi-step for complex decisions: Break large evaluations into sequential steps
Build on discussions: Use continuation_id for follow-up rounds of analysis

When to Use Consensus vs Other Tools

Use consensus for: Multi-perspective analysis, structured debates, major architectural decisions, technology evaluations, complex tradeoffs requiring diverse viewpoints
Use chat for: Single-model conversations, brainstorming, quick questions
Use thinkdeep for: Single-model deep analysis and extended reasoning on a specific topic
Use challenge for: Critical thinking and stress-testing a specific idea or assumption