Sophon Node — Overview
A cross-platform desktop automation worker that lets Sophon control real screens, keyboards, and applications on your PCs.
Sophon Node is a lightweight .NET 10 background service that turns any Windows, macOS, or Linux machine into a remotely controllable AI agent endpoint. It pairs with your Gateway, waits for commands over SignalR, and executes them locally: capture screens, move the mouse, type text, launch apps, manage windows, read/write the clipboard, run shell commands.
The AI reasoning happens on the Gateway. The Node is a dumb executor — it receives a command, runs it, returns the result. That separation keeps the security model simple: the Node never decides what to do.
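As a rough illustration of that split, the executor side can be pictured as a lookup table with no decision logic of its own. This is a sketch in TypeScript for readability only (the Node itself is a .NET service). The `NodeCommand` and `NodeResult` shapes, the `handlers` table, and the stub handler are all hypothetical; only the command name `screen.capture` comes from this page.

```typescript
// Hypothetical message shapes; the real wire format is not documented on this page.
interface NodeCommand {
  command: string;
  params: Record<string, unknown>;
}

interface NodeResult {
  ok: boolean;
  output?: unknown;
  error?: string;
}

type Handler = (params: Record<string, unknown>) => unknown;

// Fixed table of command name -> handler. The handler here is a stub
// standing in for a real platform call.
const handlers = new Map<string, Handler>([
  ["screen.capture", (params) => ({ captured: true, quality: params["quality"] ?? null })],
]);

// Receive a command, run it, return the result. The executor never
// chooses what to run; it only looks up what it was told to run.
function execute(cmd: NodeCommand): NodeResult {
  const handler = handlers.get(cmd.command);
  if (handler === undefined) {
    return { ok: false, error: `unknown command: ${cmd.command}` };
  }
  try {
    return { ok: true, output: handler(cmd.params) };
  } catch (err) {
    return { ok: false, error: err instanceof Error ? err.message : String(err) };
  }
}
```

Under this shape, every policy question (which command, for whom, with what approval) stays on the Gateway side, which is the point of the separation described above.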
When to use Sophon Node
- Automating a desktop app that has no API — a legacy accounting tool, a vendor portal that insists on browser plugins, an Excel macro.
- "Please book me a flight" flows that need to click through a real airline site with saved state.
- Remote-work assistants — you're away from your desk; your Sophon agent can click through a confirmation dialog on your home PC.
- Multi-machine orchestration — one Gateway coordinating several Nodes across your home / office / lab.
- Accessibility workflows — voice-driven desktop control for users who can't easily use a keyboard or mouse.
Don't use Sophon Node when:
- A web API exists — use a connection or MCP server instead. APIs are faster and more reliable than GUI automation.
- You want fully-unattended server automation — Nodes are designed for user desktops, not headless servers.
Architecture
      Dashboard / Agent / CLI
                 │
                 ▼
          Sophon Gateway
   (NodeHub + NodeRuntimeService)
                 │ SignalR (/hubs/node)
                 ▼
    Sophon Node (Worker Service)
                 │
        IPlatformAutomation
         /       |        \
    Windows    macOS      Linux
    P/Invoke   AppleScript   xdotool / ydotool
Key points:
- Standalone binary. The Node is a self-contained .NET 10 executable with zero references to Sophon.Core. It ships as a single file, no .NET runtime required on the target.
- Bidirectional SignalR. This is the primary transport. If WebSockets fail, the Node falls back to HTTP polling (GET /api/nodes/me/commands).
- Scope-gated commands. Every command type maps to a permission scope. The Gateway checks scopes before dispatching. The wildcard scope node.command grants every command scope, but Critical commands still require per-call approval (see Security model).
- Platform abstraction. IPlatformAutomation has 22 methods; each OS implements them differently (Windows uses P/Invoke into user32/gdi32; macOS uses AppleScript + osascript + cliclick + screencapture; Linux uses xdotool/ydotool + scrot/grim + xclip/wl-copy).
- 30-second heartbeat. The Node pings the Gateway every 30 seconds. If heartbeats stop, the Node shows up as Offline in the Dashboard.
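The heartbeat rule can be sketched as a small Gateway-side check. This is an illustration, not the Gateway's actual code: the 30-second interval is from the text above, while the grace period of three missed beats and the names `HeartbeatTracker`, `record`, and `isOnline` are assumptions.

```typescript
const HEARTBEAT_INTERVAL_MS = 30 * 1000; // from the text: one ping every 30 seconds
const MISSED_BEATS_ALLOWED = 3;          // assumption: the real offline threshold is not documented

class HeartbeatTracker {
  private lastSeen = new Map<string, number>();

  // Record a heartbeat from a node at the given time (milliseconds).
  record(nodeId: string, nowMs: number): void {
    this.lastSeen.set(nodeId, nowMs);
  }

  // A node counts as online if its last heartbeat is within the grace window.
  isOnline(nodeId: string, nowMs: number): boolean {
    const seen = this.lastSeen.get(nodeId);
    if (seen === undefined) return false; // never heard from this node
    return nowMs - seen <= HEARTBEAT_INTERVAL_MS * MISSED_BEATS_ALLOWED;
  }
}
```

Passing the clock in as `nowMs` rather than reading it inside the class keeps the check deterministic and easy to test.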
Lifecycle
- Register a node in the Dashboard → get pairing credentials.
- Pair: run sophon-node pair on the target machine. Pairing is interactive: the Node polls the Gateway every 3 seconds until you approve the pairing from the Dashboard.
- Approve in the Dashboard, set permission scopes.
- Start: sophon-node start (interactive) or sophon-node start --service (headless). Node connects, sends heartbeats, executes commands.
- Invoke: the agent calls node.command({nodeId, command, params}). Gateway checks scopes, dispatches via SignalR.
- Unpair when you're done: sophon-node unpair deletes local state, and the Dashboard can revoke the node's token.
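One way to picture the lifecycle is as a small state machine. The sketch below is only a reading aid: the state and event names are invented for illustration, and the choice to allow unpairing from any paired state is an assumption, not documented behavior.

```typescript
// State and event names are illustrative, not the product's own terminology.
type NodeState = "registered" | "pending" | "approved" | "running" | "unpaired";
type NodeEvent = "pair" | "approve" | "start" | "unpair";

const transitions: Record<NodeState, Partial<Record<NodeEvent, NodeState>>> = {
  registered: { pair: "pending" },                      // sophon-node pair
  pending: { approve: "approved", unpair: "unpaired" }, // Dashboard approval
  approved: { start: "running", unpair: "unpaired" },   // sophon-node start
  running: { unpair: "unpaired" },                      // sophon-node unpair
  unpaired: {},
};

// Returns the next state, or throws if the event is not valid in this state.
function nextState(state: NodeState, event: NodeEvent): NodeState {
  const next = transitions[state][event];
  if (next === undefined) {
    throw new Error(`cannot ${event} while ${state}`);
  }
  return next;
}

// Commands are only meaningful once the Node is started and connected.
function canInvoke(state: NodeState): boolean {
  return state === "running";
}
```

For example, applying `pair`, `approve`, and `start` in order to a `registered` node ends in `running`, while trying `start` before pairing throws.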
Status
- GA. Full test coverage (unit + integration), stable CLI, shipping as a supported product.
- Agent tool: node.command (built-in).
- Windows has a system-tray icon with pause / resume / quit actions. macOS / Linux run headless today.
- Service integration: Windows Services Manager (SC), Linux systemd, macOS launchd.
Security model
Three layers:
- Pairing approval: a Node can't do anything until a Dashboard admin explicitly approves its pairing. Unpaired Nodes are rejected at the SignalR connection layer.
- Permission scopes: per-Node scope configuration limits which commands can be dispatched. Even the wildcard node.command scope doesn't exempt individual commands from their own risk-level gates (see next).
- Approval gates on Critical commands: system.execute (shell commands) is rated Critical and always requires per-call approval regardless of scope configuration. No override.
Tokens are stored in plaintext at ~/.sophon-node/config.json today. Future releases will use OS keyrings (DPAPI on Windows, Keychain on macOS, libsecret on Linux).
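The three layers can be read as three checks applied in order. The following is a simplified sketch under stated assumptions, not the Gateway's implementation: it folds the connection-level pairing check into the same function for brevity, and the scope strings `node.screen` and `node.system`, the `catalog` table, and the `authorize` function are all hypothetical. Only `node.command`, `screen.capture`, and `system.execute` appear on this page.

```typescript
type Risk = "normal" | "critical";
type Decision = "dispatch" | "needs-approval" | "reject";

const WILDCARD_SCOPE = "node.command"; // from the text

// Hypothetical catalog: the per-command scope names are invented for illustration.
const catalog = new Map<string, { scope: string; risk: Risk }>([
  ["screen.capture", { scope: "node.screen", risk: "normal" }],
  ["system.execute", { scope: "node.system", risk: "critical" }],
]);

interface NodeRecord {
  approved: boolean; // layer 1: pairing approved in the Dashboard
  scopes: string[];  // layer 2: scopes granted to this Node
}

function authorize(node: NodeRecord, command: string, callApproved: boolean): Decision {
  // Layer 1: an unapproved Node can do nothing.
  if (!node.approved) return "reject";

  const entry = catalog.get(command);
  if (entry === undefined) return "reject";

  // Layer 2: the Node needs the command's scope or the wildcard.
  const hasScope = node.scopes.includes(WILDCARD_SCOPE) || node.scopes.includes(entry.scope);
  if (!hasScope) return "reject";

  // Layer 3: Critical commands need per-call approval, even with the wildcard.
  if (entry.risk === "critical" && !callApproved) return "needs-approval";

  return "dispatch";
}
```

Note that the wildcard only satisfies layer 2; a Node holding `node.command` still gets `needs-approval` for `system.execute` until that specific call is approved.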
Quick start
# On target machine
sophon-node pair --gateway https://gw.example.com --node-id <ID> --secret <SECRET>
# Dashboard: approve the pending node, set scopes
sophon-node start
The agent can now issue commands:
node.command({
nodeId: "my-office-pc",
command: "screen.capture",
params: { quality: 80, maxWidth: 1280 }
})
Where to go next
- Install & Pair a Node — step-by-step
- Permissions & Scopes — what each scope grants
- Available Commands — the 23 commands with parameters
- Running as a Service — systemd, launchd, Windows SC