Windows-MCP
Windows-MCP is an MCP server that gives your AI agents direct control over a Windows operating system.
It translates natural language commands from large language models into actual desktop interactions like clicking, typing, launching apps, and more.
You can use the MCP to automate software testing, perform repetitive UI tasks, or build complex agents that interact with desktop applications.
Key Features
- 🖱️ Desktop Control: Clicks, types, scrolls, and uses keyboard shortcuts.
- 📱 Application Management: Launches programs and manages windows.
- 🔍 State Inspection: Captures a snapshot of the desktop, including active apps and UI elements.
- 💻 Shell Access: Executes PowerShell commands directly.
- 🌐 Web Scraping: Extracts information from web pages.
- ⚡ Low Latency: Actions typically complete in 0.7 to 2.5 seconds.
Use Cases
- Automated QA Testing: Instruct an AI to test a new application build. It can open the software, navigate through menus, input test data, and verify expected behaviors without manual scripting.
- Routine Data Entry: Automate tedious tasks like transferring information from a spreadsheet into a legacy desktop application that lacks a modern API.
- Application Demo Agent: Build an AI that can demonstrate a complex software product live, responding to verbal requests like “show me how to create a new project” and performing the actions in real-time.
- IT Support Scripting: Create an agent that helps troubleshoot common user issues by checking specific system settings or restarting services through the GUI.
How to Use It
1. To get started, you need Python 3.13+ and the uv package manager. It’s also recommended to have English as the default language on your Windows machine.
2. Integrate the MCP server with your AI assistant.
For Claude Desktop:
- Install Claude Desktop and the `@anthropic-ai/dxt` npm package.
- Clone the `Windows-MCP` repository from GitHub.
- Navigate into the directory and run `npx @anthropic-ai/dxt pack` to build the extension.
- In Claude Desktop, go to `Settings > Extensions > Advanced Settings > Install Extension` and select the `.dxt` file you just created.
For Perplexity Desktop:
- Install Perplexity Desktop and clone the repository.
- In the app, head to `Settings > Connectors > Add Connector > Advanced`.
- Name it `Windows-MCP` and paste the provided JSON configuration, making sure to update the path to the `Windows-MCP` directory.
For Gemini CLI:
- Install the Gemini CLI and clone the repo.
- Open the `settings.json` file located in your `%USERPROFILE%/.gemini` folder.
- Add the `windows-mcp` server configuration, again ensuring the directory path is correct.
For Qwen Code:
- Install Qwen Code and clone the repository.
- Find the `settings.json` file in `%USERPROFILE%/.qwen`.
- Add the `windows-mcp` server configuration with the correct path.
For Codex CLI:
- Install the Codex CLI and clone the project.
- Navigate to `%USERPROFILE%/.codex/config.toml`.
- Add the `windows-mcp` configuration block with the proper directory path.
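The JSON-based clients above (Perplexity Desktop, Gemini CLI, Qwen Code) each expect a server entry that points at your cloned directory. The exact configuration is not reproduced on this page, so the following is only a sketch: it assumes the common `mcpServers` layout and assumes the server is started with `uv --directory <repo> run main.py`. The entry-point name `main.py` and the clone location are assumptions; check the repository README for the real values before using them.

```python
import json
from pathlib import Path


def build_server_entry(repo_dir: str) -> dict:
    """Return a server entry that starts Windows-MCP through uv.

    Assumption: the server starts with `uv --directory <repo> run main.py`.
    The entry point name is not confirmed by this page.
    """
    return {
        "command": "uv",
        "args": ["--directory", repo_dir, "run", "main.py"],
    }


def merge_into_settings(settings: dict, repo_dir: str) -> dict:
    """Add or replace the windows-mcp entry, leaving other settings intact."""
    merged = dict(settings)
    servers = dict(merged.get("mcpServers", {}))
    servers["windows-mcp"] = build_server_entry(repo_dir)
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical clone location; replace it with your own path.
    repo = str(Path.home() / "Windows-MCP")
    print(json.dumps(merge_into_settings({}, repo), indent=2))
```

Merging into the existing settings, instead of overwriting the file, keeps any other MCP servers you have already configured. Codex CLI uses TOML, so the same command and arguments would need to be written in that format.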
3. Once installed, the MCP server exposes a variety of tools to the AI client:
- Click-Tool: Clicks the mouse at specified screen coordinates.
- Type-Tool: Enters text, with an option to clear the field first.
- Clipboard-Tool: Performs copy and paste actions using the system clipboard.
- Scroll-Tool: Scrolls a window or a specific area vertically or horizontally.
- Drag-Tool: Drags the mouse from a starting point to an ending point.
- Move-Tool: Moves the mouse pointer to a specific location on the screen.
- Shortcut-Tool: Executes keyboard shortcuts like `Ctrl+C` or `Alt+Tab`.
- Key-Tool: Presses a single keyboard key.
- Wait-Tool: Pauses execution for a specified amount of time.
- State-Tool: Captures a comprehensive snapshot of the desktop, including active apps, UI elements, and a screenshot.
- Resize-Tool: Changes the size or position of an application window.
- Launch-Tool: Opens an application from the start menu.
- Shell-Tool: Executes PowerShell commands.
- Scrape-Tool: Extracts all textual information from a webpage.
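Under the hood, an MCP client invokes tools like these by sending JSON-RPC 2.0 `tools/call` requests to the server. The sketch below builds such requests for a short "open an app, type, save" sequence. The tool names come from the list above, but the argument names (`name`, `text`, `clear`, `shortcut`) are hypothetical placeholders; the real input schema for each tool is reported by the server's `tools/list` response.

```python
import json


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


if __name__ == "__main__":
    # Argument names are hypothetical; read the real input schema from
    # the server's `tools/list` response before sending anything.
    steps = [
        make_tool_call(1, "Launch-Tool", {"name": "notepad"}),
        make_tool_call(2, "Type-Tool", {"text": "hello", "clear": True}),
        make_tool_call(3, "Shortcut-Tool", {"shortcut": "ctrl+s"}),
    ]
    for step in steps:
        print(json.dumps(step))
```

In practice the AI client generates and sends these messages for you; seeing the raw shape is mainly useful when debugging a connection or reading server logs.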
FAQs
Q: Does this work on non-English versions of Windows?
A: It’s highly recommended to use English as the default language. Some tools like Launch-Tool and Switch-Tool might not work as expected with other languages unless you disable them in the server configuration.
Q: What are the main limitations right now?
A: Currently, it has trouble selecting specific text within a paragraph because it relies on the accessibility tree. Also, the Type-Tool isn’t ideal for writing code in an IDE as it pastes the entire block at once. And no, you can’t use it to play video games.
Q: Is it secure to let an AI control my desktop?
A: You should exercise caution. The tool gives an AI agent direct control over your system to perform actions. It’s best to use it in controlled environments and avoid tasks where mistakes could cause problems.
Q: Can I create my own custom tools?
A: Yes. You can look at the existing tools as a reference and build your own to suit your specific automation needs.
FAQs
Q: What exactly is the Model Context Protocol (MCP)?
A: MCP is an open standard, like a common language, that lets AI applications (clients) and external data sources or tools (servers) talk to each other. It helps AI models get the context (data, instructions, tools) they need from outside systems to give more accurate and relevant responses. Think of it as a universal adapter for AI connections.
Q: How is MCP different from OpenAI's function calling or plugins?
A: While OpenAI's tools allow models to use specific external functions, MCP is a broader, open standard. It covers not just tool use, but also providing structured data (Resources) and instruction templates (Prompts) as context. Being an open standard means it's not tied to one company's models or platform. OpenAI has even started adopting MCP in its Agents SDK.
Q: Can I use MCP with frameworks like LangChain?
A: Yes, MCP is designed to complement frameworks like LangChain or LlamaIndex. Instead of relying solely on custom connectors within these frameworks, you can use MCP as a standardized bridge to connect to various tools and data sources. There's potential for interoperability, like converting MCP tools into LangChain tools.
Q: Why was MCP created? What problem does it solve?
A: It was created because large language models often lack real-time information and connecting them to external data/tools required custom, complex integrations for each pair. MCP solves this by providing a standard way to connect, reducing development time, complexity, and cost, and enabling better interoperability between different AI models and tools.
Q: Is MCP secure? What are the main risks?
A: Security is a major consideration. While MCP includes principles like user consent and control, risks exist. These include potential server compromises leading to token theft, indirect prompt injection attacks, excessive permissions, context data leakage, session hijacking, and vulnerabilities in server implementations. Implementing robust security measures like OAuth 2.1, TLS, strict permissions, and monitoring is crucial.
Q: Who is behind MCP?
A: MCP was initially developed and open-sourced by Anthropic. However, it's an open standard with active contributions from the community, including companies like Microsoft and VMware Tanzu who maintain official SDKs.



