The rule
If an AI agent can run shell commands or open network connections on your behalf (Claude Code, Cursor agent mode, Aider, Open Interpreter, or self-hosted ones like Nous Research's Hermes), put it in a container. Sandbox modes and "ask before running" prompts help, but they are trust boundaries inside the same process you're trying to defend. A container is a boundary outside that process, enforced by the kernel.
Worst cases you're defending against:
- The agent misreads a prompt and runs something destructive (`rm -rf` on the wrong path, force-push to main, drop a database).
- A prompt-injected README, dependency, or tool output convinces the agent to exfiltrate your shell history, `.env` files, or 1Password CLI tokens.
- The agent installs a compromised package (see Shai-Hulud) and the post-install script runs as you.
In a container, the worst it can touch is the container's filesystem, whatever you explicitly bind-mounted in, and the network rules you let it have.
The shape of it
The compose file is the security policy. The things to fight for in every service block:
- A dedicated bridge network that the agent containers share with each other but not with the host. Internal hostnames (`http://hermes:8642`) replace `localhost`, which means nothing on the host can pretend to be the gateway.
- Bind-mount only what's needed: a single data dir, the project, a scoped `.env`. No `$HOME`, no `~/.ssh`, no `~/.aws`.
- Secrets via `env_file` or `${VAR:-}` references, never hard-coded. Easy to rotate, easy to omit accidentally (which is the safe default).
- `deploy.resources.limits`: a runaway agent can't eat all your RAM or pin every core if you cap it. Cheap insurance.
- `restart` policy: `unless-stopped` for long-running self-hosted agents (gateway, dashboard); `no` for short-lived per-project shells where a crash should mean "investigate," not "restart."
- Bind to `0.0.0.0` only when you understand the consequences. If you do, set a session password (the Hermes example does this with `HERMES_PASSWORD`) and put the whole thing behind Tailscale Funnel or a Cloudflare Tunnel for TLS + auth.
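As a rough sketch of how these rules might combine for a short-lived per-project shell (the opposite end of the spectrum from a long-running gateway), a service block could look like the following. The image name `example/agent-shell:1.0`, the service name `agent`, the network name `agent-net`, the `./agent.env` path, and the specific limits are all hypothetical placeholders chosen for illustration.

```yaml
# Sketch only: image, service, network, and file names are hypothetical.
services:
  agent:
    image: example/agent-shell:1.0   # placeholder; pin a real tag or digest
    restart: "no"                    # quoted so YAML does not read it as a boolean
    stdin_open: true                 # keep stdin open for an interactive shell
    tty: true
    working_dir: /workspace
    volumes:
      - ./:/workspace                # the project only; nothing from $HOME
    env_file:
      - ./agent.env                  # secrets scoped to this project
    networks:
      - agent-net
    deploy:
      resources:
        limits:
          memory: 2G
          cpus: "1.0"

networks:
  agent-net:
    driver: bridge
```

Started with `docker compose run --rm agent`, a container like this is removed when the session ends, so a crash leaves nothing running to restart.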
A real example: Nous Research Hermes
This is a working multi-service compose for the Hermes agent — gateway, dashboard, and the workspace UI — sharing one private bridge network:
services:
  hermes:
    image: nousresearch/hermes-agent:latest
    restart: unless-stopped
    command: gateway run
    ports:
      - "8643:8642"
    volumes:
      - ./hermes:/opt/data
    networks:
      - hermes-net
    # healthcheck:
    #   test: ['CMD-SHELL', 'curl -fsS http://localhost:8642/health || exit 1']
    #   interval: 10s
    #   timeout: 5s
    #   retries: 5
    #   start_period: 15s
    # Uncomment to forward specific env vars instead of using .env file:
    # environment:
    #   - ANTHROPIC_API_KEY=${ANTHROPIC_API_KEY}
    #   - OPENAI_API_KEY=${OPENAI_API_KEY}
    #   - TELEGRAM_BOT_TOKEN=${TELEGRAM_BOT_TOKEN}
    deploy:
      resources:
        limits:
          memory: 4G
          cpus: "2.0"

  dashboard:
    image: nousresearch/hermes-agent:latest
    restart: unless-stopped
    command: dashboard --host 0.0.0.0 --insecure
    ports:
      - "9120:9119"
    volumes:
      - ./hermes:/opt/data
    environment:
      - GATEWAY_HEALTH_URL=http://hermes:8642
    networks:
      - hermes-net
    depends_on:
      - hermes
    deploy:
      resources:
        limits:
          memory: 512M
          cpus: "0.5"

  hermes-workspace:
    image: ghcr.io/outsourc-e/hermes-workspace:latest
    depends_on:
      - hermes
    env_file:
      - ./hermes/.env
    networks:
      - hermes-net
    environment:
      # Internal Docker network URL (not localhost!)
      HERMES_API_URL: http://hermes:8642
      # Must match API_SERVER_KEY on the hermes-agent side when that is set
      HERMES_API_TOKEN: ${API_SERVER_KEY:-}
      # Workspace session password. REQUIRED when HOST is non-loopback (the
      # default for Docker images, so the container binds 0.0.0.0:3000).
      # Pick a strong secret. See #122.
      HERMES_PASSWORD: ${HERMES_PASSWORD:-}
      # Enable the Secure flag on session cookies when terminated behind
      # HTTPS (reverse proxy / Tailscale Funnel / Cloudflare Tunnel). See #123.
      COOKIE_SECURE: ${COOKIE_SECURE:-}
      # Trust proxy-forwarded headers (x-forwarded-for / x-real-ip) for IP
      # classification. Leave unset unless you deploy behind a trusted proxy
      # that sanitizes these headers; otherwise a client can spoof its IP
      # and bypass local-classification / rate limiting. See #125.
      TRUST_PROXY: ${TRUST_PROXY:-}
      HERMES_ALLOW_INSECURE_REMOTE: "1"
    ports:
      - '0.0.0.0:3000:3000'

networks:
  hermes-net:
    driver: bridge
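The commented-out healthcheck in the `hermes` service could be paired with a health-gated `depends_on`, so that the dashboard starts only once the gateway reports healthy. The sketch below reuses the probe from that comment as-is; it assumes `curl` is present in the image and that the gateway serves `/health` on port 8642, neither of which is verified here. It is meant to be edited into the existing service blocks, with all other keys left unchanged.

```yaml
# Fragment: edit these keys into the existing services; other keys stay as they are.
services:
  hermes:
    healthcheck:
      # Probe copied from the commented block; assumes curl exists in the image.
      test: ["CMD-SHELL", "curl -fsS http://localhost:8642/health || exit 1"]
      interval: 10s
      timeout: 5s
      retries: 5
      start_period: 15s

  dashboard:
    depends_on:
      hermes:
        condition: service_healthy   # wait for a passing probe, not just container start
```

With this in place, `docker compose ps` shows the gateway's health state next to its status, which makes a half-started gateway easier to spot.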
What's worth copying from this pattern even if you're not running Hermes:
- `hermes-net` is a private bridge. The three services talk to each other by name (`http://hermes:8642`), and the host only sees the ports that are explicitly published. The dashboard's `GATEWAY_HEALTH_URL` uses the internal name on purpose: on the bridge, `localhost` would just be the dashboard container itself, and if you wrote `http://localhost:8642` under host networking, any process on the host that grabbed the port could impersonate the gateway.
- `env_file: ./hermes/.env` keeps secrets out of the compose file. Pair with `op run --env-file=...` (or the 1Password CLI directly) so the file on disk only holds vault references, not actual tokens. Keep in mind that `op run` resolves references into the environment of the `docker compose` process, while `env_file` hands the file to the container as written, so forward the resolved values through `${VAR}` entries under `environment:`.
- `HERMES_PASSWORD` exists because the workspace binds `0.0.0.0:3000`. Any time you expose an agent UI on a non-loopback address, you're one unintended port-forward (or VPN) away from giving a stranger an agent with your API keys. Either set a session secret and put it behind a tunnel, or bind to `127.0.0.1` and reach it via an SSH local forward.
- Resource limits are not optional. A 4 GB ceiling on the gateway and half a gig on the dashboard means a runaway loop is a degraded service, not a frozen laptop.
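One way to follow the loopback advice is to change the published-port entries in place. The sketch below touches only the `ports` keys of the three services from the example; everything else is assumed to stay as written.

```yaml
# Fragment: replaces the ports entries of the services in the example above.
services:
  hermes:
    ports:
      - "127.0.0.1:8643:8642"   # reachable from the host only
  dashboard:
    ports:
      - "127.0.0.1:9120:9119"
  hermes-workspace:
    ports:
      - "127.0.0.1:3000:3000"   # was 0.0.0.0:3000:3000
```

From another machine, an SSH local forward such as `ssh -L 3000:127.0.0.1:3000 user@host` (with `user@host` as a placeholder for your own login) would then reach the workspace without publishing it on the network.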
Run it
# .env only contains references, not values
op run --env-file=./hermes/.env -- docker compose up -d
docker compose ps
docker compose logs -f hermes
When something feels wrong, the undo button is one command:
docker compose down -v # also removes named volumes; bind mounts such as ./hermes stay on disk
Things people get wrong
- Mounting `$HOME` "just in case": defeats the entire exercise. Bind only the specific dirs the agent needs (`./hermes`, your project root).
- `network_mode: host`: only set this if you need to hit a local service like `localhost:5432`. Otherwise the bridge is enough and far tighter.
- Pulling `latest` tags: fine for a personal lab, dangerous for anything you care about. Pin to a digest (`nousresearch/hermes-agent@sha256:...`) or, at minimum, a specific tag so a poisoned upstream can't change what you run between sessions.
- Skipping a non-root user: many images run as root by default. If the agent runs as root and you've mounted a project read-write, it can chown host files to root. Override with `user: "1000:1000"` (or whatever maps to your host user) when the image allows it.
- Trusting `x-forwarded-for` without a proxy in front: see the `TRUST_PROXY` comment above. A header you don't sanitize is a header someone can spoof.
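The pinning and non-root fixes above might look like this when applied to the `hermes` service. This is a sketch: the digest is left elided exactly as in the text, the `1000:1000` value is an assumption about your host user, and whether the image runs correctly as a non-root user has not been verified here.

```yaml
# Fragment: hardening tweaks for the hermes service from the example above.
services:
  hermes:
    # Digest elided on purpose; substitute the real one for the image you pulled.
    image: nousresearch/hermes-agent@sha256:...
    # Assumption: 1000:1000 matches your host UID:GID (check with `id`).
    user: "1000:1000"
    volumes:
      - ./hermes:/opt/data   # still the only host path the container sees
```

After a pull, `docker image inspect --format '{{index .RepoDigests 0}}' nousresearch/hermes-agent:latest` should print the digest to pin. If the container fails to start as a non-root user, the usual cause is ownership of the bind-mounted directory on the host.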
When you can't use Docker
- Performance-sensitive local toolchains (Xcode, native iOS builds): the container overhead is real. For these, lean harder on the agent's own sandbox mode and a separate macOS user account.
- Locked-down work laptops without Docker access: at least scope the agent to a dedicated directory and disable any tools that touch your shell config, SSH keys, or cloud creds.
TL;DR
- Every agent runs in a container, on a private bridge network.
- Bind-mount only what the agent needs. Nothing from `$HOME`.
- Secrets via `env_file` + vault references, never inline plaintext.
- Set `deploy.resources.limits`. Pin image tags. Use non-root where possible.
- If you bind to `0.0.0.0`, set a session password and put it behind a tunnel, or just bind to loopback.
- `docker compose down -v` is your undo button.