Sophon Node — Overview

A cross-platform desktop automation worker that lets Sophon control real screens, keyboards, and applications on your PCs.

Sophon Node is a lightweight .NET 10 background service that turns any Windows, macOS, or Linux machine into a remotely controllable AI agent endpoint. It pairs with your Gateway, waits for commands over SignalR, and executes them locally: capture screens, move the mouse, type text, launch apps, manage windows, read/write the clipboard, and run shell commands.

The AI reasoning happens on the Gateway. The Node is a dumb executor — it receives a command, runs it, returns the result. That separation keeps the security model simple: the Node never decides what to do.
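
The receive, run, return loop described above can be pictured with a short sketch. This is an illustration in Python rather than the Node's actual .NET code; `CommandResult`, `HANDLERS`, `capture_screen`, and `execute` are hypothetical names, and the handler returns placeholder data instead of a real screenshot.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass
class CommandResult:
    """Hypothetical result envelope sent back to the Gateway."""
    ok: bool
    data: Any = None
    error: str = ""


def capture_screen(params: Dict[str, Any]) -> Dict[str, Any]:
    # Placeholder handler: a real Node would grab the screen here.
    return {"maxWidth": params.get("maxWidth", 1920), "quality": params.get("quality", 80)}


# Assumed registry: command name -> local handler. The Node only looks up
# and runs what it is told; it makes no decisions of its own.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], Any]] = {
    "screen.capture": capture_screen,
}


def execute(command: str, params: Dict[str, Any]) -> CommandResult:
    handler = HANDLERS.get(command)
    if handler is None:
        return CommandResult(ok=False, error=f"unknown command: {command}")
    try:
        return CommandResult(ok=True, data=handler(params))
    except Exception as exc:  # report failures instead of crashing the worker
        return CommandResult(ok=False, error=str(exc))
```

All planning stays on the Gateway side; the executor only maps a command name to a local action and reports what happened.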

When to use Sophon Node

  • Automating a desktop app that has no API — a legacy accounting tool, a vendor portal that insists on browser plugins, an Excel macro.
  • "Please book me a flight" flows that need to click through a real airline site with saved state.
  • Remote-work assistants — you're away from your desk; your Sophon agent can click through a confirmation dialog on your home PC.
  • Multi-machine orchestration — one Gateway coordinating several Nodes across your home / office / lab.
  • Accessibility workflows — voice-driven desktop control for users who can't easily use a keyboard or mouse.

Don't use Sophon Node when:

  • A web API exists — use a connection or MCP server instead. APIs are faster and more reliable than GUI automation.
  • You want fully-unattended server automation — Nodes are designed for user desktops, not headless servers.

Architecture

        Dashboard / Agent / CLI
                   │
            Sophon Gateway
    (NodeHub + NodeRuntimeService)
                   │   SignalR (/hubs/node)
     Sophon Node (Worker Service)
                   │
          IPlatformAutomation
          /        |        \
     Windows     macOS     Linux
     P/Invoke  AppleScript  xdotool / ydotool
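
As a rough illustration of the platform layer in the diagram, the sketch below picks one backend per operating system behind a shared interface. It is written in Python for brevity and is not the Node's real code: `PlatformAutomation` stands in for IPlatformAutomation, and the single `backend` method is a hypothetical placeholder for the interface's actual methods.

```python
import platform
from abc import ABC, abstractmethod
from typing import Optional


class PlatformAutomation(ABC):
    """Stand-in for IPlatformAutomation (one illustrative method only)."""

    @abstractmethod
    def backend(self) -> str:
        """Name of the OS mechanism this implementation drives."""


class WindowsAutomation(PlatformAutomation):
    def backend(self) -> str:
        return "P/Invoke"


class MacAutomation(PlatformAutomation):
    def backend(self) -> str:
        return "AppleScript"


class LinuxAutomation(PlatformAutomation):
    def backend(self) -> str:
        return "xdotool / ydotool"


def select_automation(system: Optional[str] = None) -> PlatformAutomation:
    # platform.system() reports "Windows", "Darwin" (macOS), or "Linux".
    system = system or platform.system()
    implementations = {
        "Windows": WindowsAutomation,
        "Darwin": MacAutomation,
        "Linux": LinuxAutomation,
    }
    if system not in implementations:
        raise NotImplementedError(f"unsupported platform: {system}")
    return implementations[system]()
```

The rest of the worker would talk only to the interface, so command handling stays identical across operating systems.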

Key points:

  • Standalone binary. The Node is a self-contained .NET 10 executable with zero references to Sophon.Core. It ships as a single file, no .NET runtime required on the target.
  • Bidirectional SignalR. SignalR is the primary transport. If WebSockets fail, the Node falls back to HTTP polling (GET /api/nodes/me/commands).
  • Scope-gated commands. Every command type maps to a permission scope. The Gateway checks scopes before dispatching. The wildcard scope node.command grants everything.
  • Platform abstraction. IPlatformAutomation has 22 methods; each OS implements them differently (Windows uses P/Invoke into user32/gdi32; macOS uses AppleScript + osascript + cliclick + screencapture; Linux uses xdotool / ydotool + scrot / grim + xclip / wl-copy).
  • 30-second heartbeat. The Node pings the Gateway every 30 seconds. If heartbeats stop, the Node shows up as Offline in the Dashboard.
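
One way to picture the heartbeat rule is a simple timeout check on the Gateway side. The 30-second interval comes from the text above; the tolerance of three missed heartbeats and the names `node_status` and `MISSED_BEATS_BEFORE_OFFLINE` are assumptions made for this sketch, since the actual threshold is not documented here.

```python
HEARTBEAT_INTERVAL_S = 30          # stated above
MISSED_BEATS_BEFORE_OFFLINE = 3    # assumption, not a documented value


def node_status(last_heartbeat_at: float, now: float) -> str:
    """Classify a Node from the timestamp (in seconds) of its last heartbeat."""
    timeout = HEARTBEAT_INTERVAL_S * MISSED_BEATS_BEFORE_OFFLINE
    return "Online" if now - last_heartbeat_at <= timeout else "Offline"
```

Tolerating a few missed beats avoids flapping between states on a briefly congested network, at the cost of a slower Offline signal.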

Lifecycle

  1. Register a node in the Dashboard → get pairing credentials.
  2. Pair — run sophon-node pair on the target machine. Pairing is interactive: the Node polls the Gateway every 3 seconds until you approve the pairing from the Dashboard.
  3. Approve in the Dashboard, set permission scopes.
  4. Start: run sophon-node start (interactive) or sophon-node start --service (headless). Node connects, sends heartbeats, executes commands.
  5. Invoke — the agent calls node.command({nodeId, command, params}). Gateway checks scopes, dispatches via SignalR.
  6. Unpair when you're done — sophon-node unpair deletes local state, and the Dashboard can revoke the node's token.
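
The pairing wait in step 2 amounts to a polling loop. The sketch below assumes a hypothetical `check_status` callable that returns "pending", "approved", or "rejected"; those status strings, the `max_attempts` cap, and the function name are illustrative and not taken from the real CLI. Only the 3-second interval comes from the text.

```python
import time
from typing import Callable

POLL_INTERVAL_S = 3  # stated in step 2


def wait_for_approval(
    check_status: Callable[[], str],
    sleep: Callable[[float], None] = time.sleep,
    max_attempts: int = 100,  # assumed cap so the loop cannot spin forever
) -> bool:
    """Poll until the pairing is approved or rejected; True means approved."""
    for _ in range(max_attempts):
        status = check_status()
        if status == "approved":
            return True
        if status == "rejected":
            return False
        sleep(POLL_INTERVAL_S)  # still pending: wait before asking again
    return False
```

Passing `sleep` as a parameter keeps the loop easy to test without real delays.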

Status

  • GA. Full test coverage (unit + integration), stable CLI, shipping as a supported product.
  • Agent tool: node.command (built-in).
  • Windows has a system-tray icon with pause / resume / quit actions. macOS / Linux run headless today.
  • Service integration: Windows Service Control Manager (sc.exe), Linux systemd, macOS launchd.

Security model

Three layers:

  1. Pairing approval — a Node can't do anything until a Dashboard admin explicitly approves its pairing. Unpaired Nodes are rejected at the SignalR connection layer.
  2. Permission scopes — per-Node scope configuration limits what commands can be dispatched. Even the wildcard node.command scope doesn't exempt individual commands from their own risk-level gates (see next).
  3. Approval gates on Critical commands: system.execute (shell commands) is rated Critical and always requires per-call approval regardless of scope configuration. No override.
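
The three layers can be read as an ordered series of checks. The following is a simplified sketch, not the Gateway's implementation: `NodeRecord`, `required_scope`, and `authorize` are hypothetical names, each command is assumed to need a scope with the same name as the command, and the pairing check is folded into one function even though the text places it at the SignalR connection layer.

```python
from dataclasses import dataclass, field
from typing import Set

WILDCARD_SCOPE = "node.command"          # from the text
CRITICAL_COMMANDS = {"system.execute"}   # from the text


@dataclass
class NodeRecord:
    paired: bool
    scopes: Set[str] = field(default_factory=set)


def required_scope(command: str) -> str:
    # Assumption: the scope name equals the command name.
    return command


def authorize(node: NodeRecord, command: str, call_approved: bool = False) -> str:
    # Layer 1: unpaired Nodes are rejected outright.
    if not node.paired:
        return "rejected: node not paired"
    # Layer 2: the wildcard or the specific scope must be granted.
    if WILDCARD_SCOPE not in node.scopes and required_scope(command) not in node.scopes:
        return "rejected: missing scope"
    # Layer 3: Critical commands need per-call approval, even with the wildcard.
    if command in CRITICAL_COMMANDS and not call_approved:
        return "pending: per-call approval required"
    return "allowed"
```

Note that the Critical gate runs after the scope check, so holding the wildcard scope never bypasses it.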

Tokens are stored in plaintext at ~/.sophon-node/config.json today. Future releases will use OS keyrings (DPAPI on Windows, Keychain on macOS, libsecret on Linux).

Quick start

# On target machine
sophon-node pair --gateway https://gw.example.com --node-id <ID> --secret <SECRET>
# Dashboard: approve the pending node, set scopes
sophon-node start

The agent can now issue commands:

node.command({
  nodeId: "my-office-pc",
  command: "screen.capture",
  params: { quality: 80, maxWidth: 1280 }
})

Where to go next