Harusame64/desktop-touch-mcp
🔌 MCP Server · Harusame64
Advanced Windows desktop automation for AI agents, using entity-based interaction to reduce UI interaction failures.
desktop-touch-mcp connects LLM reasoning to the Windows desktop. Instead of clicking at raw X/Y coordinates, agents perceive and manipulate UI elements as distinct entities. The tool combines UI Automation (UIA) for element inspection, the Chrome DevTools Protocol (CDP) for browser automation, and direct system controls for keyboard, mouse, clipboard, and terminal operations.
Key features include 'entity leases' to maintain state consistency, 'verified delivery' to confirm that actions are actually registered by the OS, and 'causal context' to help agents understand the sequence of operations. The architecture targets the 'brittleness' problem in desktop automation, where minor UI shifts cause traditional scripts to fail. Interaction memory lets the agent track past UI states, which supports autonomous agents that carry out complex, multi-step desktop workflows.
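To make the lease and verified-delivery ideas concrete, here is a minimal conceptual sketch in Python. It is not desktop-touch-mcp's API: `Lease`, `FakeDesktop`, `acquire`, `ui_changed`, and `type_text` are hypothetical names, and the rule that a lease expires when the UI "generation" changes is an assumption made for illustration. The desktop is simulated in memory, so nothing here touches UIA or CDP.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Lease:
    """Hypothetical lease: valid only for the UI generation it was issued in."""
    entity_id: str
    generation: int


@dataclass
class FakeDesktop:
    """In-memory stand-in for a desktop (assumption, not the real tool)."""
    generation: int = 0
    values: dict = field(default_factory=dict)   # entity_id -> current text
    history: list = field(default_factory=list)  # simple interaction memory

    def acquire(self, entity_id: str) -> Lease:
        """Hand out a lease tied to the current UI generation."""
        if entity_id not in self.values:
            raise KeyError(entity_id)
        return Lease(entity_id, self.generation)

    def ui_changed(self) -> None:
        """Simulate a UI shift; leases issued earlier become stale."""
        self.generation += 1

    def type_text(self, lease: Lease, text: str) -> dict:
        """Act on an entity, refusing stale leases and verifying the result."""
        if lease.generation != self.generation:
            result = {"ok": False, "reason": "stale_lease"}
        else:
            self.values[lease.entity_id] = text
            # Verified delivery: read the value back instead of assuming success.
            delivered = self.values[lease.entity_id] == text
            result = {"ok": delivered,
                      "reason": None if delivered else "not_delivered"}
        # Record every attempt, successful or not, for later inspection.
        self.history.append((lease.entity_id, text, result["ok"]))
        return result


desktop = FakeDesktop(values={"search_box": ""})
lease = desktop.acquire("search_box")
first = desktop.type_text(lease, "hello")   # lease is current, action verified
desktop.ui_changed()                        # the UI shifts under the agent
second = desktop.type_text(lease, "world")  # stale lease is rejected
```

In this sketch the second call fails explicitly rather than acting on an element that may have moved or disappeared, which is the kind of failure a coordinate-based script would not notice. The agent would then re-acquire the entity and retry.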
💡Highlights
- Entity-based UI interaction
- UIA and CDP integration
- Verified delivery & state memory
🎯For
- AI Agent Developers
- Automation Engineers