Sandbox & Code Execution Isolation

How Sophon isolates code execution and risky tools using containers and resource limits.

When an agent needs to run code — a Python data-crunch, a generated C# snippet, a shell command from a marketplace skill — Sophon does not run it directly on the host. Untrusted and generated code goes into an isolated Docker container with resource limits, network policy, and automatic cleanup. The same ISandboxOrchestrator powers both ad-hoc code.execute calls and sandboxed skills.

This page explains the isolation model and contrasts it with running directly on the host.

The three execution tools

Sophon exposes execution through three distinct tools, each with a different risk level. Risk level drives approval gates.

| Tool | What it runs | Where | Risk |
|---|---|---|---|
| code.execute | Python or C# code | Sandboxed container | High |
| os.execute_sandboxed | A shell command, wrapped and run inside the sandbox | Sandboxed container | High |
| os.execute | A shell command (bash, sh, cmd, powershell) | Directly on the host OS | Critical |

The first two are gated at High — they pause and ask a human before running. os.execute is Critical: it touches the real machine with full host privileges, so it always requires explicit approval and confirmation. Prefer the sandboxed path whenever possible; os.execute_sandboxed even tells the model to fall back to os.execute only when Docker is unavailable.

os.execute runs with the same privileges as the Sophon process. Treat every approval for it as you would handing someone a terminal on your server.
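
The mapping from tool to approval gate can be sketched as a small lookup. The tool names and risk levels come from the table above; the registry shape, the `gate_for` helper, and the field names are hypothetical and are not Sophon's actual API.

```python
from enum import Enum


class Risk(Enum):
    HIGH = "high"
    CRITICAL = "critical"


# Tool names and risk levels are taken from the table above.
# The dictionary layout itself is an illustrative assumption.
TOOL_RISK = {
    "code.execute": Risk.HIGH,
    "os.execute_sandboxed": Risk.HIGH,
    "os.execute": Risk.CRITICAL,
}


def gate_for(tool: str) -> dict:
    """Describe the approval gate implied by a tool's risk level."""
    risk = TOOL_RISK[tool]
    return {
        "tool": tool,
        "risk": risk.value,
        # Both High and Critical tools pause for a human before running.
        "needs_approval": True,
        # Critical tools additionally require an explicit confirmation.
        "needs_confirmation": risk is Risk.CRITICAL,
        # Only os.execute touches the host directly.
        "runs_on_host": tool == "os.execute",
    }
```

Calling `gate_for("os.execute")` yields a gate with both approval and confirmation set, while the two sandboxed tools only need approval.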

What "sandboxed" means

The Docker-based orchestrator (DockerSandboxOrchestrator) creates a fresh container per execution:

  • Languages — Python runs on a python:3.13-alpine image; C# runs on the .NET 10 SDK Alpine image via dotnet run. A minimal project file and an offline NuGet config are generated for C# so restores stay local.
  • Network policy — the container's network mode is none (fully isolated) unless network access is explicitly required, in which case it switches to bridge. Python that imports third-party packages triggers a pip install, which enables network for that run.
  • Resource limits — memory cap, CPU quota, and a process-count limit are applied to every container, alongside a hard wall-clock timeout that kills the container if exceeded.
  • Cleanup — the per-run workspace directory and the container are removed afterward, success or failure.
  • gVisor — where configured, containers run under the gVisor runtime, a user-space kernel that intercepts syscalls so sandboxed code never talks directly to the host kernel.
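
As a rough illustration of how the properties listed above map onto a container launch, the sketch below builds a `docker run` argument list. It is an assumption that the CLI is a fair stand-in: the real DockerSandboxOrchestrator may talk to the Docker API directly. The function name, its parameters, and the `/workspace` mount are hypothetical; the flags themselves (`--rm`, `--network`, `--memory`, `--cpus`, `--pids-limit`, `--runtime`) are standard Docker options, and `runsc` is the runtime name gVisor registers.

```python
def docker_run_args(image, command, *, memory_bytes, cpu_percent, pids_limit,
                    network_enabled=False, use_gvisor=False, workspace=None):
    """Build (but do not execute) a `docker run` argv for one sandboxed run."""
    args = [
        "docker", "run",
        "--rm",                                    # remove the container afterward
        "--network", "bridge" if network_enabled else "none",
        "--memory", str(memory_bytes),             # hard memory cap, in bytes
        "--cpus", f"{cpu_percent / 100:g}",        # 50 percent -> 0.5 CPUs
        "--pids-limit", str(pids_limit),           # process-count limit
    ]
    if use_gvisor:
        args += ["--runtime", "runsc"]             # gVisor's user-space kernel
    if workspace:
        # Hypothetical mount point for the per-run workspace directory.
        args += ["-v", f"{workspace}:/workspace", "-w", "/workspace"]
    return args + [image, *command]


# Example values; the PID limit of 64 is an illustrative assumption.
argv = docker_run_args(
    "python:3.13-alpine", ["python", "main.py"],
    memory_bytes=256 * 1024 * 1024, cpu_percent=50, pids_limit=64,
)
```

The wall-clock timeout is not a `docker run` flag; in this sketch the caller would enforce it and kill the container on expiry.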

Limits and knobs

A sandbox run is described by a request object. These are the defaults a single code.execute call uses:

| Knob | Default | Notes |
|---|---|---|
| Timeout | 30 seconds | Container is killed on expiry |
| MemoryLimitBytes | 256 MB | Hard memory cap |
| CpuLimitPercent | 50 | Percentage of one CPU (50 = half a core) |
| NetworkEnabled | off | Container uses the none network mode until enabled |
| NetworkAllowList | empty | Optional outbound host allowlist |
| PythonPackages | none | Auto-resolved from imports; enables network for pip |

Skills can declare their own limits through a security policy. The built-in presets give a sense of the range:

  • standard: 512 MB / 1.0 CPU / 256 PIDs / 300 s, with no network and a read-only root filesystem.
  • minimal: 128 MB / 0.25 CPU / 30 s.
  • locked-down: runs as nobody with a seccomp profile, 256 MB, and a 60 s timeout.

Network-enabled presets either open outbound traffic or restrict it to an explicit host allowlist.

Container security policies can also drop all Linux capabilities, mount a read-only root filesystem with only specific writable paths (/workspace, /tmp), and apply a restrictive seccomp profile. The exact policy depends on the preset or per-skill manifest in effect.
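
To make the policy options concrete, here is a minimal sketch that turns a policy dictionary into Docker hardening flags. The preset values are the ones quoted above, and settings the text does not state are simply left out. The dictionary keys, the `hardening_flags` helper, the `seccomp.json` path, and the choice to default to no network are all illustrative assumptions; the flags themselves are standard `docker run` options.

```python
# Values quoted in the text; unspecified settings are omitted, not guessed.
PRESETS = {
    "standard": {"memory_mb": 512, "cpus": 1.0, "pids": 256, "timeout_s": 300,
                 "network": False, "read_only_root": True},
    "minimal": {"memory_mb": 128, "cpus": 0.25, "timeout_s": 30},
    "locked-down": {"memory_mb": 256, "timeout_s": 60,
                    "user": "nobody", "seccomp": True},
}


def hardening_flags(policy: dict, seccomp_profile: str = "seccomp.json") -> list[str]:
    """Translate a policy dict into `docker run` flags (timeout is handled by the caller)."""
    flags: list[str] = []
    if policy.get("drop_all_caps"):
        flags += ["--cap-drop", "ALL"]             # drop all Linux capabilities
    if policy.get("read_only_root"):
        # Read-only root with /tmp writable; /workspace would be a separate mount.
        flags += ["--read-only", "--tmpfs", "/tmp"]
    if policy.get("user"):
        flags += ["--user", policy["user"]]
    if policy.get("seccomp"):
        # The profile path is a placeholder for whatever profile is in effect.
        flags += ["--security-opt", f"seccomp={seccomp_profile}"]
    if "memory_mb" in policy:
        flags += ["--memory", f"{policy['memory_mb']}m"]
    if "cpus" in policy:
        flags += ["--cpus", str(policy["cpus"])]
    if "pids" in policy:
        flags += ["--pids-limit", str(policy["pids"])]
    if not policy.get("network", False):
        flags += ["--network", "none"]             # assumed default: no network
    return flags
```

A per-skill manifest could be merged over a preset before calling the helper, for example `hardening_flags({**PRESETS["minimal"], "drop_all_caps": True})`.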

Sandboxed vs host execution

| | Sandboxed (Docker) | Host (os.execute) |
|---|---|---|
| Filesystem | Isolated workspace only | Full host filesystem |
| Network | none by default, opt-in bridge | Whatever the host has |
| Resource limits | Memory, CPU, PIDs, timeout | Timeout only |
| Privileges | Reduced, optionally gVisor-isolated | Same as the Sophon process |
| Risk level | High | Critical |

Process fallback

If Docker is not available on the host, the orchestrator can fall back to a process-based runner that executes the code as a child process with a temp workspace and a timeout. This keeps trusted bundled skills working, but it provides reduced isolation — there is no container, no filesystem boundary beyond the temp directory, and the process inherits host network access. Untrusted or network-isolated code should always run under the Docker sandbox.
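
A process-based runner of the kind described above can be approximated in a few lines. This is a sketch, not Sophon's implementation: the `run_in_process` name and result shape are hypothetical, and it handles Python code only. It also shows why isolation is reduced: the child process is an ordinary process with the parent's privileges and network access.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_process(code: str, timeout_seconds: float = 30.0) -> dict:
    """Run Python code as a child process in a temp workspace with a timeout."""
    # The temp directory is deleted on exit from the with-block, success or failure.
    with tempfile.TemporaryDirectory(prefix="sandbox-") as workspace:
        script = Path(workspace) / "main.py"
        script.write_text(code, encoding="utf-8")
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workspace,            # working directory only, not a filesystem boundary
                capture_output=True,
                text=True,
                timeout=timeout_seconds,  # the child is killed if this expires
            )
        except subprocess.TimeoutExpired:
            return {"exit_code": None, "stdout": "", "stderr": "", "timed_out": True}
        return {
            "exit_code": proc.returncode,
            "stdout": proc.stdout,
            "stderr": proc.stderr,
            "timed_out": False,
        }
```

Nothing here restricts memory, CPU, process count, or file access outside the workspace, which is the gap the Docker sandbox closes.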

Where to go next