Zen

The Zen MCP Server enables Claude to connect with and direct other AI models, including Google’s Gemini and OpenAI’s o3.

This turns Claude into an orchestrator that manages a team of specialized AIs to assist with complex development tasks.

You can use it for in-depth code analysis, collaborative problem-solving, and getting multiple AI perspectives, all within a single, continuous conversation.

Features

  • 🤖 Multi-model AI orchestration – Claude automatically selects the best AI model for each development task, or you can assign one manually
  • 🧠 Extended thinking capabilities – Access Gemini’s specialized thinking models with configurable token budgets for deep analysis
  • 🔍 Professional code review – Comprehensive code analysis with severity-based issue prioritization across entire repositories
  • 🐛 Expert debugging assistance – Root cause analysis with multiple ranked hypotheses and systematic error investigation
  • 📊 Smart file analysis – Architecture analysis, performance evaluation, and security assessment for files and directories
  • Pre-commit validation – Multi-repository change validation with requirement verification and regression detection
  • 💬 AI-to-AI conversations – Persistent conversation threading allowing models to question each other and build on previous exchanges
  • 🔄 Cross-tool continuation – Seamlessly continue conversations across different tools while preserving full context
  • 🌐 Web search integration – Models can recommend specific searches for current documentation and best practices
  • Large prompt handling – Automatically bypasses MCP’s 25K token limit for complex analysis requests

Use Cases

  • Complex debugging workflows where you need O3’s logical reasoning to analyze error patterns, Gemini Pro’s deep thinking for architectural issues, and Flash for quick validation checks, all within a single conversation thread
  • Comprehensive code review processes where different AI models contribute specialized perspectives – Gemini Pro for security analysis, O3 for logic verification, and Flash for style consistency – with each model building on previous findings
  • Architecture design validation where you can brainstorm with one model, get critical analysis from another, and validate implementation approaches with a third, maintaining full context across all exchanges
  • Multi-repository maintenance where you need to analyze changes across multiple codebases, validate against requirements, and ensure no regressions are introduced before committing

How To Use It

1. Clone the repository and run the setup script. This builds the necessary Docker images, creates a .env file for your API keys, and starts the required services.

# Clone the repository
git clone https://github.com/BeehiveInnovations/zen-mcp-server.git
cd zen-mcp-server
# Run the one-command setup
./setup-docker.sh

2. Add your API keys to the newly created .env file.

# Edit the .env file
nano .env
# Add your keys
# GEMINI_API_KEY=your-gemini-api-key
# OPENAI_API_KEY=your-openai-api-key
# OPENROUTER_API_KEY=your-openrouter-key

3. Connect to Claude.

For Claude Code (CLI):

# Add the Zen MCP server to Claude Code
claude mcp add zen -s user -- docker exec -i zen-mcp-server python server.py
# Verify the server is listed
claude mcp list

For Claude Desktop:

{
  "mcpServers": {
    "zen": {
      "command": "docker",
      "args": ["exec", "-i", "zen-mcp-server", "python", "server.py"]
    }
  }
}

4. You can now invoke the Zen MCP server in your prompts using natural language.

Usage Examples

Basic Commands:

"Think deeper about this architecture design with zen"
"Use zen to perform a code review of auth.py for security issues"
"Debug this test failure with zen, the bug might be in my_class.swift"
"Use flash to quickly format this code based on policy.md"
"Get o3 to debug this logic error in checkOrders() function"

Multi-Model Workflows:

"Study the code, pick a scaling strategy and debate with pro to settle on two best approaches. Then get o3 to validate the logic of your top choice."
"Implement a new authentication system. Get a codereview from gemini pro and o3, then fix critical issues and perform a precommit check."

Available Tools

chat – Collaborative thinking and development discussions

  • message: Your question or topic (required)
  • model: auto|pro|flash|o3|o3-mini (default: server default)
  • files: Optional file paths for context
  • thinking_mode: minimal|low|medium|high|max (Gemini only)

thinkdeep – Extended reasoning for complex problems

  • current_analysis: Your current thinking (required)
  • model: auto|pro|flash|o3|o3-mini (default: server default)
  • problem_context: Additional context
  • focus_areas: Specific aspects to analyze
  • files: File paths for context
  • thinking_mode: minimal|low|medium|high|max (default: max)
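A sketch of a thinkdeep invocation, using the parameters above with illustrative values:

{
  "current_analysis": "The checkout service times out under load; the retry policy may be amplifying traffic",
  "model": "pro",
  "problem_context": "Latency spike began after the v2.3 deploy",
  "focus_areas": ["retry and backoff strategy", "connection pooling"],
  "thinking_mode": "max"
}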

codereview – Professional code review with severity levels

  • files: File paths or directories (required)
  • model: auto|pro|flash|o3|o3-mini (default: server default)
  • review_type: full|security|performance|quick
  • focus_on: Specific review aspects
  • standards: Coding standards to enforce
  • severity_filter: critical|high|medium|all
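For example, a security-focused review of a directory might use arguments like these (the path and standards are illustrative):

{
  "files": ["/workspace/src/auth/"],
  "model": "pro",
  "review_type": "security",
  "standards": "OWASP ASVS plus the project style guide",
  "severity_filter": "high"
}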

precommit – Pre-commit validation across repositories

  • path: Starting directory (default: current)
  • original_request: Requirements context
  • model: auto|pro|flash|o3|o3-mini (default: server default)
  • compare_to: Compare against branch/tag
  • review_type: full|security|performance|quick
  • severity_filter: Issue severity filter
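An illustrative precommit call that validates pending changes against a base branch (values are hypothetical):

{
  "path": "/workspace",
  "original_request": "Add rate limiting to the public API without changing existing endpoints",
  "model": "o3",
  "compare_to": "main",
  "review_type": "full"
}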

debug – Expert debugging with root cause analysis

  • error_description: Issue description (required)
  • model: auto|pro|flash|o3|o3-mini (default: server default)
  • error_context: Stack traces or logs
  • files: Related file paths
  • runtime_info: Environment details
  • previous_attempts: What you’ve tried
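A debug request might carry arguments like the following (file names and error details are made up for illustration):

{
  "error_description": "Intermittent 401 responses after token refresh",
  "model": "o3",
  "error_context": "JWTExpiredError raised in middleware/auth.py; stack trace attached",
  "files": ["/workspace/middleware/auth.py"],
  "previous_attempts": "Increased token TTL; issue persists"
}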

analyze – File and architecture analysis

  • files: File paths or directories (required)
  • question: What to analyze (required)
  • model: auto|pro|flash|o3|o3-mini (default: server default)
  • analysis_type: architecture|performance|security|quality|general
  • output_format: summary|detailed|actionable
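For instance, a performance-oriented analyze call could look like this (the directory and question are illustrative):

{
  "files": ["/workspace/services/billing/"],
  "question": "Where are the main bottlenecks in the invoice generation path?",
  "model": "pro",
  "analysis_type": "performance",
  "output_format": "actionable"
}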

Model Selection

Auto Mode (Recommended):

Set DEFAULT_MODEL=auto and Claude automatically picks the optimal model for each task.
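This is typically set in the .env file created during setup, for example:

# Let Claude pick the model per task
DEFAULT_MODEL=auto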

Manual Selection:

  • “Use pro for deep security analysis”
  • “Use flash for quick formatting check”
  • “Use o3 to debug this logic error”
  • “Use o3-mini for balanced analysis”

Thinking Modes

Thinking Modes are specific to Gemini models and let you control the trade-off between response quality and API cost.

A low thinking mode uses fewer tokens for a faster, cheaper response on simple tasks, while a high or max mode uses more tokens for a deeper, more thorough analysis of complex problems.

  • minimal (128 tokens) – Simple tasks
  • low (2,048 tokens) – Basic reasoning
  • medium (8,192 tokens) – Default for most tasks
  • high (16,384 tokens) – Complex problems
  • max (32,768 tokens) – Exhaustive analysis
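You can request a mode directly in a prompt; phrasing along these lines should work (illustrative examples):

"Use gemini pro with max thinking mode to analyze this memory leak"
"Use flash with minimal thinking mode to tidy up this docstring"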

FAQs

Q: What is the main advantage of using Zen MCP instead of just Claude?
A: The primary advantage is AI orchestration. Zen MCP allows Claude to act as a team lead, delegating tasks to other AI models like Gemini or O3 that may have different strengths, such as a larger context window for analysis or superior logical reasoning for debugging. This happens within one continuous conversation.

Q: How does auto mode decide which model to use?
A: Claude analyzes the request complexity, task type, and model strengths. Complex architecture gets Gemini Pro, quick checks get Flash, logical debugging gets O3. You can always override with specific model names.

Q: Can I use multiple API providers simultaneously?
A: Yes, but if you have both native APIs (Gemini, OpenAI) and OpenRouter configured for the same models, native APIs take priority. This prevents ambiguity about which provider serves each model.

Q: What happens when I hit MCP’s 25K token limit?
A: The server automatically detects large prompts and asks Claude to save them as files instead. This bypasses MCP limits and uses Gemini’s full 1M token context for analysis.

Q: Can I customize the system prompts for different tools?
A: Yes, all system prompts are defined in prompts/tool_prompts.py. You can modify them globally or override the get_system_prompt() method for specific tools.

General MCP FAQs

Q: What exactly is the Model Context Protocol (MCP)?

A: MCP is an open standard, like a common language, that lets AI applications (clients) and external data sources or tools (servers) talk to each other. It helps AI models get the context (data, instructions, tools) they need from outside systems to give more accurate and relevant responses. Think of it as a universal adapter for AI connections.

Q: How is MCP different from OpenAI's function calling or plugins?

A: While OpenAI's tools allow models to use specific external functions, MCP is a broader, open standard. It covers not just tool use, but also providing structured data (Resources) and instruction templates (Prompts) as context. Being an open standard means it's not tied to one company's models or platform. OpenAI has even started adopting MCP in its Agents SDK.

Q: Can I use MCP with frameworks like LangChain?

A: Yes, MCP is designed to complement frameworks like LangChain or LlamaIndex. Instead of relying solely on custom connectors within these frameworks, you can use MCP as a standardized bridge to connect to various tools and data sources. There's potential for interoperability, like converting MCP tools into LangChain tools.

Q: Why was MCP created? What problem does it solve?

A: It was created because large language models often lack real-time information and connecting them to external data/tools required custom, complex integrations for each pair. MCP solves this by providing a standard way to connect, reducing development time, complexity, and cost, and enabling better interoperability between different AI models and tools.

Q: Is MCP secure? What are the main risks?

A: Security is a major consideration. While MCP includes principles like user consent and control, risks exist. These include potential server compromises leading to token theft, indirect prompt injection attacks, excessive permissions, context data leakage, session hijacking, and vulnerabilities in server implementations. Implementing robust security measures like OAuth 2.1, TLS, strict permissions, and monitoring is crucial.

Q: Who is behind MCP?

A: MCP was initially developed and open-sourced by Anthropic. However, it's an open standard with active contributions from the community, including companies like Microsoft and VMware Tanzu who maintain official SDKs.
