Claude Code Proxy: Use Claude Code CLI with Any OpenAI-Compatible API

Connect Claude Code CLI to OpenAI, local models, and other AI providers with this free proxy server. Save costs and gain provider flexibility.

Claude Code Proxy is a free, open-source proxy server that converts Claude API requests to OpenAI API calls.

By translating requests from Anthropic’s format into the OpenAI standard, it lets you use a wide range of large language models (LLMs) while keeping the Claude Code workflow you’re used to.

It’s aimed at developers who like the Claude Code CLI but want the flexibility to use other models, such as OpenAI’s GPT series, models hosted on Microsoft Azure, or models running locally on their own machine.

Features

  • Full Claude API Compatibility – Complete /v1/messages endpoint support with proper request/response conversion (see the example request after this list)
  • Multiple Provider Support – Works with OpenAI, Azure OpenAI, Ollama local models, and any OpenAI-compatible API
  • Smart Model Mapping – Configure BIG and SMALL models via environment variables for different Claude model types
  • Function Calling Support – Complete tool use support with proper conversion between Claude and OpenAI formats
  • Streaming Responses – Real-time Server-Sent Events (SSE) streaming for live response generation
  • Image Processing – Base64 encoded image input support for multimodal interactions
  • Robust Error Handling – Comprehensive error handling and detailed logging for troubleshooting
  • High Performance – Async/await architecture with connection pooling and configurable timeouts
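
For instance, once the proxy is running (see the setup steps below), it accepts Claude-format requests directly. A streaming call might look like this sketch, which assumes the default port 8082 used later in this guide; the model name and headers are illustrative:

# Claude-format request to the proxy, streamed back as Server-Sent Events
curl -N http://localhost:8082/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: any-value" \
  -d '{"model": "claude-sonnet-4-20250514", "max_tokens": 128, "stream": true, "messages": [{"role": "user", "content": "Write a haiku about proxies"}]}'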

Use Cases

  • Cost Optimization – Route Claude Code requests through cheaper OpenAI models or free local models instead of paying Anthropic’s API rates
  • Local Development – Use completely offline models through Ollama while keeping the Claude Code interface you’re familiar with
  • Enterprise Compliance – Connect to approved internal AI endpoints or Azure OpenAI deployments that meet corporate security requirements
  • Provider Redundancy – Switch between multiple AI providers for better uptime and rate limit management
  • Custom Model Testing – Experiment with different language models through the same Claude Code CLI without changing your development workflow

How to Use It

1. Clone the repository from GitHub and install the required Python packages.

git clone <repository-url>   # use the URL from the project's GitHub page
cd claude-code-proxy         # directory name may differ

# Using UV (recommended)
uv sync
# Or using pip
pip install -r requirements.txt

2. Configure Environment Variables

cp .env.example .env
# Edit .env file with your settings

Set up your environment variables based on your target provider:

For OpenAI:

OPENAI_API_KEY="sk-your-openai-key"
OPENAI_BASE_URL="https://api.openai.com/v1"
BIG_MODEL="gpt-4o"
SMALL_MODEL="gpt-4o-mini"

For local models via Ollama:

OPENAI_API_KEY="dummy-key"  # Required but can be any value
OPENAI_BASE_URL="http://localhost:11434/v1"
BIG_MODEL="llama3.1:70b"
SMALL_MODEL="llama3.1:8b"
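
For Azure OpenAI, the same variables apply as long as your deployment exposes an OpenAI-compatible endpoint. The resource URL and deployment names below are placeholder assumptions; substitute your own:

# Hypothetical Azure OpenAI setup (resource and deployment names are placeholders)
OPENAI_API_KEY="your-azure-api-key"
OPENAI_BASE_URL="https://your-resource.openai.azure.com/openai/v1"
BIG_MODEL="gpt-4o"        # your Azure deployment name
SMALL_MODEL="gpt-4o-mini" # your Azure deployment name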

3. Start the server once your configuration is saved:

# Direct execution
python start_proxy.py
# Or with UV
uv run claude-code-proxy

4. Use Claude Code with the Proxy:

# Temporary usage (single run)
ANTHROPIC_BASE_URL=http://localhost:8082 claude
# Or set it for the whole shell session
export ANTHROPIC_BASE_URL=http://localhost:8082
claude

The proxy automatically maps Claude model requests to your configured models. Claude “haiku” requests route to your SMALL_MODEL, while “sonnet” and “opus” requests go to your BIG_MODEL.
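
You can see this mapping in action by requesting a haiku-class model and checking the model field in the response. This is a sketch; the model name is illustrative and the exact alias-matching rules depend on the proxy version:

# A "haiku" request should be forwarded to SMALL_MODEL (gpt-4o-mini in the OpenAI example)
curl -s http://localhost:8082/v1/messages \
  -H "content-type: application/json" \
  -H "x-api-key: any-value" \
  -d '{"model": "claude-3-5-haiku-20241022", "max_tokens": 32, "messages": [{"role": "user", "content": "Hi"}]}'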

Pros

  • Cost Savings – Use cheaper alternatives to Anthropic’s API pricing
  • Provider Flexibility – Switch between multiple AI providers without changing workflows
  • Local Development – Run completely offline with Ollama models
  • Easy Setup – Simple environment variable configuration
  • Full Feature Support – Maintains streaming, function calling, and image support
  • Performance Optimized – Async architecture handles multiple concurrent requests efficiently
  • Open Source – MIT license allows customization and commercial use

Cons

  • Additional Complexity – Adds another layer between Claude Code and the AI provider
  • Dependency Management – Requires maintaining Python dependencies and proxy server
  • Response Quality Differences – Alternative models may not match Claude’s performance for coding tasks

FAQs

Q: Can I use this with multiple AI providers simultaneously?
A: The current version supports one provider configuration at a time. You’d need to change your environment variables and restart the proxy to switch providers, though you could run multiple proxy instances on different ports.
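
For example, two instances could serve two providers side by side. This sketch assumes the listening port is configurable via a PORT variable; verify the actual setting against the project’s .env.example:

# Hypothetical: one proxy instance per provider, each on its own port
OPENAI_API_KEY="sk-your-openai-key" OPENAI_BASE_URL="https://api.openai.com/v1" PORT=8082 python start_proxy.py &
OPENAI_API_KEY="dummy-key" OPENAI_BASE_URL="http://localhost:11434/v1" PORT=8083 python start_proxy.py &
# Point Claude Code at whichever instance you want
ANTHROPIC_BASE_URL=http://localhost:8083 claude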

Q: Is there any performance overhead from using the proxy?
A: The proxy adds minimal latency since it’s built with async/await architecture and connection pooling. Most of the response time comes from the actual AI provider, not the proxy conversion layer.

Q: What happens if the proxy server goes down?
A: Claude Code will fail to connect since it’s routing through the proxy. You’d need to either restart the proxy or temporarily switch back to direct Anthropic API usage by removing the ANTHROPIC_BASE_URL environment variable.
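
If you exported the variable, reverting is a one-liner:

# Drop the proxy and talk to Anthropic's API directly again
unset ANTHROPIC_BASE_URL
claude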

Q: Does this proxy send my code to a third party?
A: The proxy runs on your local machine. It sends API requests from your machine directly to the LLM provider you configure (like OpenAI or your local Ollama instance). Your code doesn’t pass through any extra external servers.

Q: Is Claude Code Proxy an official tool from Anthropic?
A: No, it is a third-party, open-source project created by the community. It is not officially supported by Anthropic.
