iPhone
The iPhone MCP Server is built to automate tasks on an iPhone with Appium. It lets your code or AI agent see and interact with an iPhone’s screen.
The MCP server handles everything from controlling apps and tapping buttons to capturing the screen, all through a simple HTTP interface.
You can use it to build automated tests for your iOS app, or even to create an AI assistant that operates a phone on its own.
Features
- 📱 Device Info: Pulls a full list of installed apps and other device details.
- 📸 Screen & UI Capture: Grabs a screenshot and provides an optimized XML layout of the screen, which is great for keeping AI token usage down.
- 👆 Full UI Interaction: Lists on-screen elements and lets you perform taps, swipes, and text input.
- 🚀 Application Control: Launch any app using its bundle ID or check which app is currently in the foreground.
Use Cases
- Automated QA Cycles: Instead of manually tapping through your app every time you push a new build, you can script the entire process. It’s perfect for regression testing. Just write a script to check all your core features and run it automatically to catch any new bugs.
- Building AI Agents: This is the piece you need to connect a large language model to a real device. The model can receive the screen’s XML snapshot, understand the UI, and send back commands like iphone_operate_click to interact with the app, just like a person would.
- Repetitive Task Automation: If you have a task that requires you to open an app and perform the same sequence of taps and inputs every day, you can automate it. This could be anything from logging information to checking for updates in an app that doesn’t have an API.
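For the AI agent case, it can help to check a model's reply before anything touches the phone. The sketch below is one way to do that in Python. The reply format (a JSON object with `tool` and `arguments` keys) and the helper name `parse_agent_command` are assumptions made for illustration and are not part of the project; only the tool names come from this server's tool list.

```python
import json

# Tool names as listed for this server. Everything else here is illustrative.
ALLOWED_TOOLS = {
    "iphone_device_info",
    "iphone_device_apps",
    "iphone_interface_snapshot",
    "iphone_interface_elements",
    "iphone_operate_click",
    "iphone_operate_swipe",
    "iphone_operate_text_input",
    "iphone_operate_app_launch",
    "iphone_operate_get_current_bundle_id",
}


def parse_agent_command(reply_text):
    """Parse a model reply into (tool, arguments), rejecting unknown tools.

    Assumes the model was prompted to answer with a JSON object such as
    {"tool": "...", "arguments": {...}} (a hypothetical convention).
    """
    try:
        command = json.loads(reply_text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(command, dict):
        raise ValueError("reply must be a JSON object")

    tool = command.get("tool")
    if not isinstance(tool, str) or tool not in ALLOWED_TOOLS:
        raise ValueError(f"unknown tool: {tool!r}")

    arguments = command.get("arguments", {})
    if not isinstance(arguments, dict):
        raise ValueError("arguments must be a JSON object")
    return tool, arguments
```

Rejecting anything outside the allow-list keeps a confused or manipulated model from issuing commands the server does not offer. The argument names a given tool accepts are not covered here and would need to be read from the server itself.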
How to Use It
1. Clone the code from the GitHub repository.
git clone https://github.com/Lakr233/iphone-mcp.git && cd iphone-mcp
2. Set up and activate a Python virtual environment to keep dependencies clean.
python -m venv .venv && source .venv/bin/activate
3. Install the necessary Python packages.
pip install -r requirements.txt
4. Install Appium and its XCUITest driver.
npm install -g appium && appium driver install xcuitest
5. Configure the WebDriver Agent, which is what Appium uses to communicate with the iPhone.
6. Tell the MCP server which device to use. Open the start.sh file and put your device’s UDID in the DEVICE_UDID variable.
7. To get everything running, just execute the shell script.
./start.sh
This command starts both the Appium instance and the MCP server. By default, you can reach the server at http://127.0.0.1:8765/mcp.
8. The server exposes several functions for controlling the device:
- iphone_device_info: Gets hardware details.
- iphone_device_apps: Lists all installed applications.
- iphone_interface_snapshot: Returns a screenshot and the UI’s XML structure.
- iphone_interface_elements: Lists the interactive UI elements on the current screen.
- iphone_operate_click: Performs a tap action.
- iphone_operate_swipe: Executes a swipe gesture.
- iphone_operate_text_input: Enters text into a field.
- iphone_operate_app_launch: Opens an app via its bundle ID.
- iphone_operate_get_current_bundle_id: Fetches the bundle ID of the active app.
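MCP clients invoke tools like these with a JSON-RPC 2.0 request whose method is `tools/call`. An MCP client library normally builds this message for you, so the Python sketch below is only meant to show its shape. The helper name `build_tool_call` and the `bundle_id` argument key are assumptions; the real argument names are whatever the server reports in its tool schemas.

```python
import json
from itertools import count

# Simple increasing request IDs; any unique ID scheme would do.
_request_ids = count(1)


def build_tool_call(name, arguments=None):
    """Build the JSON-RPC 2.0 envelope for an MCP tools/call request."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments or {}},
    }


# "bundle_id" is a hypothetical argument name used for illustration only.
launch = build_tool_call(
    "iphone_operate_app_launch", {"bundle_id": "com.apple.Preferences"}
)
print(json.dumps(launch, indent=2))
```

A real session also starts with an `initialize` handshake before any tool call, which is another reason to prefer an existing MCP client over hand-written requests.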
9. You can adjust the server’s behavior by setting environment variables inside the start.sh script. DEVICE_UDID is required. The other variables have defaults, so you only need to set them to change things like the host and port of the Appium and MCP servers or the log level.
export APPIUM_HOST=${APPIUM_HOST:-127.0.0.1}
export APPIUM_PORT=${APPIUM_PORT:-4723}
export PLATFORM_NAME=${PLATFORM_NAME:-iOS}
export AUTOMATION_NAME=${AUTOMATION_NAME:-XCUITest}
export WDA_LOCAL_PORT=${WDA_LOCAL_PORT:-8100}
export DEFAULT_LAUNCH_TIMEOUT=${DEFAULT_LAUNCH_TIMEOUT:-30000}
export NEW_COMMAND_TIMEOUT=${NEW_COMMAND_TIMEOUT:-360}
export SERVER_HOST=${SERVER_HOST:-127.0.0.1}
export SERVER_PORT=${SERVER_PORT:-8765}
export SERVER_PATH=${SERVER_PATH:-/mcp}
export DEVICE_UDID=${DEVICE_UDID:-""}
export DEFAULT_BUNDLE_ID=${DEFAULT_BUNDLE_ID:-com.apple.Preferences}
export LOG_LEVEL=${LOG_LEVEL:-INFO}
FAQs
Q: What if I have another service using port 8765?
A: You can change the port. Open the start.sh file and set the SERVER_PORT environment variable to a free port of your choice. Do the same for APPIUM_PORT if Appium’s default port is taken.
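If a client script needs to follow a port change, it can derive the endpoint from the same variables. This is a minimal Python sketch that mirrors the `${VAR:-default}` fallbacks in start.sh; the function name `mcp_endpoint` is an assumption, and the script is presumed to run with the same environment as the server.

```python
import os

# Defaults copied from the start.sh variables.
DEFAULTS = {
    "SERVER_HOST": "127.0.0.1",
    "SERVER_PORT": "8765",
    "SERVER_PATH": "/mcp",
}


def mcp_endpoint(env=None):
    """Return the MCP server URL implied by the given environment mapping."""
    env = os.environ if env is None else env

    def lookup(key):
        # Like the shell's ${VAR:-default}: unset or empty uses the default.
        return env.get(key) or DEFAULTS[key]

    return f"http://{lookup('SERVER_HOST')}:{lookup('SERVER_PORT')}{lookup('SERVER_PATH')}"


print(mcp_endpoint({"SERVER_PORT": "9000"}))  # http://127.0.0.1:9000/mcp
```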
Q: Does this work with iOS simulators?
A: The setup is focused on a physical iPhone with a UDID. While Appium and WebDriverAgent support simulators, this project’s scripts are configured for a real device out of the box. You would need to adjust the Appium capabilities in the configuration for a simulator.
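How this project assembles its Appium capabilities is not shown here, so the following Python sketch is generic rather than a description of its code. It uses standard XCUITest driver capability names and the defaults from start.sh; the helper name `build_capabilities` and the example device values are placeholders.

```python
def build_capabilities(udid=None, device_name=None, platform_version=None):
    """Build an Appium capabilities dict for the XCUITest driver.

    Pass a UDID to target one specific device, or a device name plus
    platform version to let Appium pick a matching simulator.
    """
    caps = {
        "platformName": "iOS",
        "appium:automationName": "XCUITest",
        "appium:wdaLocalPort": 8100,       # WDA_LOCAL_PORT default
        "appium:newCommandTimeout": 360,   # NEW_COMMAND_TIMEOUT default
    }
    if udid:
        caps["appium:udid"] = udid
    elif device_name and platform_version:
        caps["appium:deviceName"] = device_name
        caps["appium:platformVersion"] = platform_version
    else:
        raise ValueError("need a UDID, or a device name and platform version")
    return caps


# Placeholder simulator values; use ones installed on your machine.
simulator_caps = build_capabilities(device_name="iPhone 15", platform_version="17.0")
```

Simulators also have UDIDs, so depending on how the scripts use DEVICE_UDID, supplying a booted simulator's identifier may be another route worth trying.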
Q: The XML output is still too large. How can I reduce it?
A: The XML is already optimized, but its size depends on the complexity of the app’s UI. For AI agent use, you could programmatically process the XML before sending it to the model, stripping out elements that aren’t relevant to the agent’s current task.
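One way to do that kind of preprocessing is sketched below in Python. It assumes the snapshot uses attributes similar to Appium's XCUITest page source (`visible`, `name`, `label`, and so on); the server's optimized XML may be shaped differently, so treat the attribute names and the helper `prune_ui_xml` as assumptions to adapt.

```python
import xml.etree.ElementTree as ET


def prune_ui_xml(xml_text, keep_attrs=("type", "name", "label", "value")):
    """Drop invisible subtrees and all attributes outside keep_attrs."""
    root = ET.fromstring(xml_text)
    _prune(root, set(keep_attrs))
    return ET.tostring(root, encoding="unicode")


def _prune(element, keep_attrs):
    # Iterate over a copy so children can be removed safely.
    for child in list(element):
        if child.get("visible") == "false":
            element.remove(child)  # removes the child and its descendants
        else:
            _prune(child, keep_attrs)
    # Strip attributes last, after the visibility check has used them.
    for attr in list(element.attrib):
        if attr not in keep_attrs:
            del element.attrib[attr]


sample = (
    '<XCUIElementTypeApplication name="Settings" visible="true">'
    '<XCUIElementTypeButton name="Wi-Fi" visible="true" x="0" y="10"/>'
    '<XCUIElementTypeButton name="Hidden" visible="false"/>'
    "</XCUIElementTypeApplication>"
)
print(prune_ui_xml(sample))
```

Coordinates are dropped in this example, which only makes sense if the agent taps elements by name or label. If it taps by position, add the coordinate attributes to `keep_attrs`.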
FAQs
Q: What exactly is the Model Context Protocol (MCP)?
A: MCP is an open standard, like a common language, that lets AI applications (clients) and external data sources or tools (servers) talk to each other. It helps AI models get the context (data, instructions, tools) they need from outside systems to give more accurate and relevant responses. Think of it as a universal adapter for AI connections.
Q: How is MCP different from OpenAI's function calling or plugins?
A: While OpenAI's tools allow models to use specific external functions, MCP is a broader, open standard. It covers not just tool use, but also providing structured data (Resources) and instruction templates (Prompts) as context. Being an open standard means it's not tied to one company's models or platform. OpenAI has even started adopting MCP in its Agents SDK.
Q: Can I use MCP with frameworks like LangChain?
A: Yes, MCP is designed to complement frameworks like LangChain or LlamaIndex. Instead of relying solely on custom connectors within these frameworks, you can use MCP as a standardized bridge to connect to various tools and data sources. There's potential for interoperability, like converting MCP tools into LangChain tools.
Q: Why was MCP created? What problem does it solve?
A: It was created because large language models often lack real-time information and connecting them to external data/tools required custom, complex integrations for each pair. MCP solves this by providing a standard way to connect, reducing development time, complexity, and cost, and enabling better interoperability between different AI models and tools.
Q: Is MCP secure? What are the main risks?
A: Security is a major consideration. While MCP includes principles like user consent and control, risks exist. These include potential server compromises leading to token theft, indirect prompt injection attacks, excessive permissions, context data leakage, session hijacking, and vulnerabilities in server implementations. Implementing robust security measures like OAuth 2.1, TLS, strict permissions, and monitoring is crucial.
Q: Who is behind MCP?
A: MCP was initially developed and open-sourced by Anthropic. However, it's an open standard with active contributions from the community, including companies like Microsoft and VMware Tanzu who maintain official SDKs.



