Image Gen

The Image Gen MCP Server connects your AI assistants with modern image generation models such as OpenAI's gpt-image-1 and Google's Imagen 4.

It serves as a universal translator for image requests. Your MCP client sends a standardized request, the server handles communication with the chosen model's API, and the generated image is returned to the client.

Features

  • Multi-Provider Support – Access to OpenAI (gpt-image-1, DALL-E 3, DALL-E 2) and Google Gemini (Imagen-4, Imagen-4-Ultra, Imagen-3) models.
  • MCP Compatibility – Works with any MCP client, including Claude Desktop, Cursor AI, and Zed Editor.
  • Image Editing – Edit existing images with text instructions using mask-based editing.
  • Built-in Prompt Templates – Pre-optimized templates for social media, blog headers, product photography, and more.
  • Smart Storage Management – Organized local storage with automatic cleanup and metadata tracking.
  • Production-Ready Deployment – Docker support with monitoring, reverse proxy, and SSL configuration.
  • Multiple Transport Options – STDIO for desktop integration, HTTP for web deployment, SSE for real-time applications.
  • Resource Management – MCP resources for image history, storage statistics, and model information.
  • Intelligent Caching – Memory and Redis-based caching with TTL and cleanup policies.
  • Error Handling – Rate limiting, retry logic, and detailed logging for production reliability.

Use Cases

  • On-the-Fly Content Creation: A blogger or social media manager can generate custom illustrations and graphics directly inside their writing or chat interface.
  • Rapid Prototyping for Developers: A frontend developer can ask for placeholder images or concept art right from their code editor.
  • Enhanced Brainstorming Sessions: A UI/UX designer collaborating with a product manager can instantly visualize ideas and concepts discussed in a chat.
  • Custom Documentation Visuals: A technical writer can create custom diagrams and technical illustrations for their guides without needing separate design software.

How to Use It

1. Install Python 3.10+ and the uv package manager, then clone the repository:

git clone <repository-url>
cd image-gen-mcp
uv sync

2. Configure your environment by copying .env.example to .env and adding your API keys:

cp .env.example .env

3. Edit the .env file to include:

  • PROVIDERS__OPENAI__API_KEY for OpenAI models
  • PROVIDERS__GEMINI__API_KEY for Gemini models (optional)
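
A minimal .env might then look like the following (the key values are placeholders; the Gemini line can be omitted if you only use OpenAI models):

```
PROVIDERS__OPENAI__API_KEY=sk-your-openai-key
PROVIDERS__GEMINI__API_KEY=your-gemini-key
```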

4. Run the MCP server:

Development Mode:

# HTTP transport for web development
./run.sh dev
# STDIO transport for Claude Desktop integration
./run.sh stdio
# With development tools including Redis Commander
./run.sh dev --tools

Manual Execution:

# Default STDIO transport
uv run python -m gpt_image_mcp.server
# HTTP transport with custom port
uv run python -m gpt_image_mcp.server --transport streamable-http --port 3001
# SSE transport for real-time applications
uv run python -m gpt_image_mcp.server --transport sse --port 8080

5. Integrate with your MCP client:

Claude Desktop Configuration:

{
  "mcpServers": {
    "image-gen-mcp": {
      "command": "uv",
      "args": ["--directory", "/path/to/image-gen-mcp", "run", "image-gen-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Continue.dev Integration:

{
  "mcpServers": {
    "gpt-image": {
      "command": "uv",
      "args": ["--directory", "/path/to/image-gen-mcp", "run", "image-gen-mcp"],
      "env": {
        "OPENAI_API_KEY": "your-api-key-here"
      }
    }
  }
}

Available Tools and Parameters

list_available_models

  • Returns dictionary with model information, capabilities, and provider details
  • No parameters required

generate_image

  • prompt (required): Text description of desired image
  • model (optional): Specific model to use (e.g., "gpt-image-1", "dall-e-3", "imagen-4")
  • quality: "auto" | "high" | "medium" | "low" (default: "auto")
  • size: "1024x1024" | "1536x1024" | "1024x1536" (default: "1536x1024")
  • style: "vivid" | "natural" (default: "vivid")
  • output_format: "png" | "jpeg" | "webp" (default: "png")
  • background: "auto" | "transparent" | "opaque" (default: "auto")
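
To illustrate the parameter surface, here is a hypothetical Python helper (my own sketch, not part of the server) that fills in the documented defaults and rejects unsupported values before a generate_image tool call:

```python
# Hypothetical helper mirroring generate_image's documented parameters.
VALID = {
    "quality": {"auto", "high", "medium", "low"},
    "size": {"1024x1024", "1536x1024", "1024x1536"},
    "style": {"vivid", "natural"},
    "output_format": {"png", "jpeg", "webp"},
    "background": {"auto", "transparent", "opaque"},
}

def build_generate_args(prompt, model=None, quality="auto", size="1536x1024",
                        style="vivid", output_format="png", background="auto"):
    """Assemble a validated argument dict for a generate_image tool call."""
    if not prompt:
        raise ValueError("prompt is required")
    args = {"prompt": prompt, "quality": quality, "size": size,
            "style": style, "output_format": output_format,
            "background": background}
    for key, allowed in VALID.items():
        if args[key] not in allowed:
            raise ValueError(f"unsupported {key}: {args[key]!r}")
    if model is not None:  # e.g. "gpt-image-1", "dall-e-3", "imagen-4"
        args["model"] = model
    return args
```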

edit_image

  • image_data (required): Base64 encoded image or data URL
  • prompt (required): Edit instructions
  • mask_data (optional): Mask for targeted editing
  • Same size, quality, and format options as generate_image
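
Since image_data accepts a base64 data URL, a small helper like this (a sketch of mine, not part of the server) can prepare a local file for edit_image:

```python
import base64
from pathlib import Path

def to_data_url(path: str) -> str:
    """Encode a local image file as a base64 data URL for edit_image's image_data."""
    suffix = Path(path).suffix.lstrip(".").lower() or "png"
    mime = {"jpg": "jpeg"}.get(suffix, suffix)  # normalize .jpg -> image/jpeg
    encoded = base64.b64encode(Path(path).read_bytes()).decode("ascii")
    return f"data:image/{mime};base64,{encoded}"
```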

Available Resources

  • generated-images://{image_id} – Access specific generated images
  • image-history://recent – Browse recent generation history
  • storage-stats://overview – Storage usage and statistics
  • model-info://gpt-image-1 – Model capabilities and pricing information
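
These URIs follow a plain scheme://identifier shape, so a client can take them apart with standard library tooling; a quick sketch:

```python
from urllib.parse import urlsplit

def split_resource_uri(uri: str):
    """Split an MCP resource URI such as 'generated-images://{image_id}'
    into its (scheme, identifier) parts."""
    parts = urlsplit(uri)
    return parts.scheme, parts.netloc + parts.path
```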

Prompt Templates

  • Creative Image: Artistic image generation
  • Product Photography: Commercial product images
  • Social Media Graphics: Platform-optimized posts
  • Blog Headers: Article header images
  • OG Images: Social media preview images
  • Video Thumbnails: YouTube/video thumbnails

FAQs

Q: Can I use multiple AI providers simultaneously?
A: Yes, the server supports both OpenAI and Google Gemini providers simultaneously. You can specify which model to use for each generation request, or let the server use the default model. Each provider requires its own API key configuration.

Q: How does the image storage and caching system work?
A: Images are stored locally in an organized directory structure with metadata. The server provides both immediate base64 data and persistent resource URIs for accessing images. Caching supports both memory-based and Redis backends with configurable TTL and cleanup policies.
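
The memory cache with TTL described above can be sketched roughly as follows (a simplified illustration, not the server's actual implementation):

```python
import time

class TTLCache:
    """Minimal in-memory cache where entries expire after a fixed TTL."""

    def __init__(self, ttl_seconds=3600.0):
        self.ttl = ttl_seconds
        self._store = {}  # key -> (expires_at, value)

    def set(self, key, value):
        self._store[key] = (time.monotonic() + self.ttl, value)

    def get(self, key, default=None):
        entry = self._store.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # lazy cleanup of expired entries
            return default
        return value
```

A Redis backend would follow the same set/get shape, delegating expiry to Redis TTLs instead of tracking timestamps locally.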

Q: Can I deploy this in a production environment?
A: Yes, the server includes comprehensive production deployment options with Docker containers, reverse proxy configuration, SSL support, monitoring dashboards, and rate limiting. It supports multiple transport methods including HTTP for web deployment and STDIO for desktop integration.

Q: How do I handle API rate limits and errors?
A: The server includes built-in retry logic, rate limiting protection, and comprehensive error handling. It automatically handles temporary API failures and provides detailed error logging for troubleshooting production issues.
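
The retry idea can be sketched as a small exponential-backoff wrapper (an illustration only; the server's real logic also covers rate limiting and structured logging):

```python
import time

def with_retry(fn, attempts=3, base_delay=0.5):
    """Call fn, retrying on failure with exponential backoff."""
    for attempt in range(attempts):
        try:
            return fn()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
```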


