Deployment

Docker image, docker-compose stack, systemd unit, incident runbooks, BCP template.

Production-oriented configuration for running Horizon as a service. The deploy/ directory in the repo holds these files; they are not imported by the package. Copy, adapt, ship.

What’s in deploy/

| Path | Purpose |
|---|---|
| Dockerfile | Multi-stage Python 3.12 image |
| docker-compose.yml | Horizon + Prometheus + Grafana stack |
| prometheus.yml | Scrape config for horizon:9100 |
| systemd/horizon.service | Systemd unit for bare-metal or VM deployments |
| runbooks/kill-switch.md | Incident playbook: kill switch fired |
| runbooks/feed-stale.md | Incident playbook: feed stopped ticking |
| runbooks/credential-rotation.md | Rotating broker credentials |
| runbooks/disaster-recovery.md | Full host-loss recovery |
| bcp.md | Business Continuity Plan template (Rule 206(4)-4) |
| .env.example | Template for deployment secrets |

Container

```sh
docker build -f deploy/Dockerfile --build-arg HORIZON_EXTRAS=equity,options \
    -t horizon:prod .
```

HORIZON_EXTRAS maps to pip install horizon[...]. Pick only the extras you need; an empty value installs the core package only.
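
The mapping from the build arg to a pip requirement can be sketched with POSIX parameter expansion. This is an illustration only: the helper name pip_spec is hypothetical, and the real Dockerfile may assemble the requirement string differently.

```sh
# Hypothetical helper: turn a HORIZON_EXTRAS value into a pip requirement.
# ${1:+[$1]} expands to "[extras]" only when $1 is set and non-empty,
# so an empty value falls back to the bare core package.
pip_spec() {
    printf 'horizon%s\n' "${1:+[$1]}"
}

pip_spec "equity,options"   # prints horizon[equity,options]
pip_spec ""                 # prints horizon
```

The same expansion works inside a Dockerfile RUN step, since build args are visible to the shell as environment variables during the build.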

The image:

  • Python 3.12 slim base.
  • Builder stage: compiles and resolves optional deps.
  • Runtime stage: no toolchain, just the venv + the horizon package.
  • Runs as non-root horizon user (uid 1000, gid 1000).
  • Expects /state mounted from the host for the audit log, DLQ, and any SQLite files.
  • Reads credentials from env vars (fed by docker-compose or the systemd unit).
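
Outside of docker-compose, the points above translate into a docker run call roughly like the one below. Treat it as a sketch under assumptions: the function name run_horizon, the container name, and publishing port 9100 are illustrative choices, not something the image requires.

```sh
# Sketch: start the image with a host directory mounted at /state.
# Assumptions: the container name "horizon" and the published port 9100
# are illustrative. The host directory must be writable by uid 1000.
run_horizon() {
    state_dir=$1    # host directory for the audit log, DLQ, SQLite files
    env_file=$2     # file of KEY=value credentials
    mkdir -p "$state_dir" || return 1
    # Bind mounts need an absolute host path.
    abs_state=$(cd "$state_dir" && pwd) || return 1
    docker run -d --name horizon \
        --env-file "$env_file" \
        -v "$abs_state:/state" \
        -p 9100:9100 \
        horizon:prod
}
```

A call such as run_horizon ./state deploy/.env would start the container. If the state directory is owned by a different user, change its ownership to 1000:1000 first so the non-root horizon user can write to it.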

docker-compose stack

```sh
cd deploy
cp .env.example .env           # fill in credentials
docker-compose up -d
```
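
Because .env starts as a copy of the template, it is easy to bring the stack up with a credential still blank. A small preflight check, sketched below, lists keys that have no value. The helper name check_env_file is hypothetical, and it makes no assumption about which keys .env.example actually contains.

```sh
# List keys in an env file whose value is still empty, e.g. "SOME_KEY=".
# Returns 0 only when every KEY=value line has a value.
check_env_file() {
    [ -r "$1" ] || { echo "cannot read $1" >&2; return 2; }
    # Print the key of every line with nothing (or only blanks) after "=".
    missing=$(awk -F= '/^[A-Za-z_][A-Za-z0-9_]*=[ \t]*$/ { print $1 }' "$1")
    if [ -n "$missing" ]; then
        printf '%s\n' "$missing" | sed 's/^/unset: /' >&2
        return 1
    fi
    return 0
}
```

Chaining it in front of the start command, as in check_env_file .env && docker-compose up -d, stops the stack from starting with half-filled credentials. Comment lines and keys that are absent altogether are not flagged.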

Services:

  • horizon: the trading daemon, exposes /metrics on port 9100.
  • prometheus: scrapes horizon:9100 with 30-day retention.
  • grafana: dashboards on port 3000 (change the admin password).

Mount ./state for durable storage:

  • audit.db
  • audit_archive.db
  • dlq.db
  • destruction.jsonl
  • any preset files

The healthcheck on the horizon service runs horizon version every 30s; Docker restarts the container after three consecutive failures.
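
To confirm the healthcheck is passing after the stack comes up, one option is to poll the status Docker records for the container. The helper below is a sketch, not part of Horizon: wait_healthy is a hypothetical name, and it relies only on docker inspect. With a 30s interval, the first result can take roughly that long to appear, so the status reads "starting" at first.

```sh
# Poll a container's health status until it reports "healthy".
# Requires that the container defines a healthcheck.
wait_healthy() {
    container=$1
    attempts=${2:-10}      # number of polls before giving up
    delay=${3:-3}          # seconds between polls
    i=0
    status=""
    while [ "$i" -lt "$attempts" ]; do
        status=$(docker inspect --format '{{.State.Health.Status}}' "$container" 2>/dev/null) || status=""
        if [ "$status" = "healthy" ]; then
            return 0
        fi
        i=$((i + 1))
        sleep "$delay"
    done
    echo "$container not healthy after $attempts checks (last: ${status:-unknown})" >&2
    return 1
}
```

The container ID for the compose service can be obtained with docker-compose ps -q horizon and passed as the first argument, with enough attempts to cover at least one healthcheck interval.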

systemd

For non-container deployments:

```sh
sudo cp deploy/systemd/horizon.service /etc/systemd/system/
sudo systemctl daemon-reload
sudo systemctl enable --now horizon.service
```

The unit hardens the service:

  • NoNewPrivileges
  • PrivateTmp
  • ProtectSystem=strict
  • ProtectHome
  • Empty CapabilityBoundingSet
  • Write access only to /var/lib/horizon and /var/log/horizon

Put credentials in /etc/horizon/credentials.env (mode 0600, owned by horizon:horizon).
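
The permission and ownership requirement is easy to get wrong after an edit with sudo, so it can be worth checking before the service starts. The helper below is a sketch with a hypothetical name (check_credentials_file); it assumes GNU stat, which is the usual case on systemd hosts.

```sh
# Verify a credentials file is mode 0600 and owned by the expected user.
# An empty file with the right attributes can be created with:
#   sudo install -m 0600 -o horizon -g horizon /dev/null /etc/horizon/credentials.env
check_credentials_file() {
    file=$1
    owner=${2:-horizon:horizon}
    [ -f "$file" ] || { echo "$file: not found" >&2; return 1; }
    mode=$(stat -c '%a' "$file") || return 1        # e.g. 600
    actual=$(stat -c '%U:%G' "$file") || return 1   # e.g. horizon:horizon
    [ "$mode" = "600" ] || { echo "$file: mode $mode, want 600" >&2; return 1; }
    [ "$actual" = "$owner" ] || { echo "$file: owner $actual, want $owner" >&2; return 1; }
}
```

Running check_credentials_file /etc/horizon/credentials.env as root returns 0 only when both the mode and the owner match.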

Runbooks

Four playbooks ship. Wire them into the alerting bridge so an on-call page links straight to the right one.

  • Kill switch fired: acknowledge, identify root cause, decide halt or restart.
  • Feed stale: confirm connectivity, decide manual unwind or auto-reconnect.
  • Credential rotation: staged rotation with dry-run verify + revocation.
  • Disaster recovery: full host-loss restore from backup.
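
Wiring the playbooks into alerting comes down to a lookup from alert name to runbook path. The sketch below shows one shape this could take. The alert names are hypothetical placeholders; only the file paths come from deploy/runbooks/.

```sh
# Map an alert name to the runbook an on-call page should link to.
# Alert names are hypothetical; substitute the names your alerting bridge emits.
runbook_for() {
    case "$1" in
        kill_switch_fired)   echo "deploy/runbooks/kill-switch.md" ;;
        feed_stale)          echo "deploy/runbooks/feed-stale.md" ;;
        credential_rotation) echo "deploy/runbooks/credential-rotation.md" ;;
        host_lost)           echo "deploy/runbooks/disaster-recovery.md" ;;
        *) echo "no runbook for alert: $1" >&2; return 1 ;;
    esac
}

runbook_for feed_stale   # prints deploy/runbooks/feed-stale.md
```

Returning a non-zero status for unknown alerts makes a missing mapping visible instead of sending a page with no link.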

Business Continuity Plan

deploy/bcp.md is a skeleton structured around Advisers Act Rule 206(4)-4. It is a starting point, not a finished plan: have counsel review and sign off on the final version.

Structure:

  1. Governance (owner, sponsor, review cycle)
  2. Scope (covered systems)
  3. Risk assessment (scenario + impact + mitigation table)
  4. Recovery objectives (RTO / RPO)
  5. Procedures (linked to the runbooks)
  6. Communication plan (internal, clients, SEC, custodian)
  7. Data retention (links to Retention)
  8. Vendor management
  9. Testing (annual drill, quarterly kill-switch test, monthly audit verify)
  10. Sign-off

What’s intentionally missing

  • No embedded TLS termination. Use a reverse proxy (Caddy, Traefik, Nginx) if you want to expose anything beyond /metrics.
  • No built-in HA. Two Horizon instances writing to the same SQLite sink will collide. For multi-writer, migrate to Postgres (planned).
  • No embedded Vault. Secrets come from the environment. The docker-compose file points at an .env file; production should read from Vault or AWS Secrets Manager directly via the Secrets Protocol.
  • No K8s manifests. They are straightforward to derive from the Dockerfile and environment variables, and are out of scope for this template.
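
Since a second instance writing to the same SQLite sink will collide, a lock file can at least prevent an accidental double start on one host. The sketch below uses flock(1) from util-linux; run_exclusive is a hypothetical wrapper, not something Horizon ships. It is not a substitute for real HA, and it should not be relied on across hosts or on network filesystems.

```sh
# Run a command only if no other process holds the lock file.
# With -n, flock exits non-zero immediately when the lock is already taken.
# The lock is released when the wrapped command exits.
run_exclusive() {
    lock_file=$1
    shift
    flock -n "$lock_file" "$@"
}
```

Wrapping the daemon's start command in run_exclusive with a lock file inside the state directory means a second start attempt fails fast instead of opening the same database files.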

Related

  • Metrics for Prometheus scrape config.
  • Alerting for PagerDuty / Slack / email / Twilio integrations.
  • Retention for audit enforcement.
  • Tracing for the OTel collector sidecar.
  • Recovery for crash-recovery semantics.