Tracing
OpenTelemetry spans across the submit path. NullTracer is the default; OpenTelemetry is opt-in.
Tracing shows where time goes when an order is slow, or which gate rejected a specific action. horizon.observability.tracing provides a narrow Tracer Protocol, a NullTracer default, and OpenTelemetryTracer for production deployments.
Protocol
from typing import Any, Protocol

class Tracer(Protocol):
    def span(self, name: str, **attrs: Any) -> Any: ...
span(...) returns a context manager. Attributes are recorded on the span. Implementations never raise in steady state.
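Because the Protocol is structural, any object with a matching span() method can stand in for a tracer. The following is a minimal sketch of a test double; RecordingTracer and submit_order are hypothetical names used for illustration and are not part of Horizon.

```python
from contextlib import contextmanager
from typing import Any, Iterator, Protocol


class Tracer(Protocol):
    def span(self, name: str, **attrs: Any) -> Any: ...


class RecordingTracer:
    """Hypothetical test double: keeps (name, attrs) for every finished span."""

    def __init__(self) -> None:
        self.spans: list = []

    @contextmanager
    def span(self, name: str, **attrs: Any) -> Iterator[None]:
        try:
            yield
        finally:
            # Record on exit so the span is captured even if the body raises.
            self.spans.append((name, attrs))


def submit_order(tracer: Tracer, venue: str) -> str:
    # RecordingTracer never subclasses Tracer; matching span() is enough.
    with tracer.span("venue.submit", venue=venue):
        return "accepted"


recorder = RecordingTracer()
result = submit_order(recorder, "alpaca")
```

A double like this lets a unit test assert which spans a code path opened without installing any OpenTelemetry package.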
Null default
The default tracer is NullTracer. All Tier 1 to 3 code paths pay zero overhead when tracing is not configured. hz.run() uses NullTracer unless tracer= is passed.
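As an illustration of how a no-op tracer can keep the untraced path cheap, here is a sketch that hands back one shared, reusable context manager. NoOpTracer is a hypothetical stand-in; Horizon's actual NullTracer may be implemented differently.

```python
from contextlib import nullcontext
from typing import Any


class NoOpTracer:
    """Illustrative stand-in; not Horizon's NullTracer source."""

    # contextlib.nullcontext is reusable, so one instance serves every span
    # and no object is allocated per call.
    _SPAN = nullcontext()

    def span(self, name: str, **attrs: Any) -> Any:
        return self._SPAN


tracer = NoOpTracer()
with tracer.span("venue.submit", venue="alpaca"):
    status = "submitted"
```

Exceptions raised inside the with block still propagate unchanged, since nullcontext does not suppress them.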
OpenTelemetry
from horizon.observability import OpenTelemetryTracer
import horizon as hz
tracer = OpenTelemetryTracer(service_name="horizon-prod")
hz.run(mode="live", feed=..., tracer=tracer, ...)
OpenTelemetryTracer lazy-imports opentelemetry-api and opentelemetry-sdk. Install with:
pip install opentelemetry-api opentelemetry-sdk opentelemetry-exporter-otlp
It picks an exporter automatically:
- opentelemetry-exporter-otlp installed: OTLP over gRPC to localhost:4317.
- Otherwise: stdout console exporter.
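The selection rule above can be sketched with a plain import probe. This shows one way such detection might work, not necessarily how OpenTelemetryTracer does it; pick_exporter and its probe_module parameter are hypothetical. The probed module path is where the OpenTelemetry Python packages place the OTLP/gRPC span exporter.

```python
import importlib.util

# Module that provides the OTLP/gRPC span exporter in the OpenTelemetry Python packages.
OTLP_GRPC_MODULE = "opentelemetry.exporter.otlp.proto.grpc.trace_exporter"


def pick_exporter(probe_module: str = OTLP_GRPC_MODULE) -> str:
    """Hypothetical helper: "otlp" if probe_module is importable, else "console"."""
    try:
        spec = importlib.util.find_spec(probe_module)
    except (ImportError, ValueError):
        # find_spec imports parent packages first, so a missing parent raises here.
        spec = None
    return "otlp" if spec is not None else "console"


exporter = pick_exporter()
```

Probing with find_spec avoids importing the exporter itself until it is actually needed, which matches the lazy-import behaviour described above.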
Override with constructor arguments:
tracer = OpenTelemetryTracer(
    service_name="horizon-prod",
    exporter="otlp",
    otlp_endpoint="https://my-collector.example:4317",
)
Or pass a pre-built TracerProvider:
from opentelemetry.sdk.trace import TracerProvider
provider = TracerProvider(...)
tracer = OpenTelemetryTracer(tracer_provider=provider)
What spans open automatically
| Span name | Where | Attributes |
|---|---|---|
| venue.submit | Around every broker submit() call | venue, account, side, market_id, client_order_id |
The submit span is the one most incident responders need. Every fill reconciliation, latency question, and broker reject traces back to it.
Manual spans
Strategies, features, and risk checks can open spans manually. The tracer is available wherever hz.run(...) is configured with one:
class MyStrategy(Strategy):
    def on_tick(self, ctx):
        with ctx.tracer.span("compute_signal", market_id=ctx.market_id):
            return self._compute(ctx)
Or from any module that holds a reference to the tracer:
with tracer.span("reconcile", venue="alpaca"):
    reconciler.reconcile()
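When the same function should always run inside a span, a small decorator saves repeating the with block. This is a sketch under assumptions: traced and ListTracer are hypothetical helpers, not Horizon APIs, and the decorator relies only on the span(name, **attrs) contract from the Protocol.

```python
import functools
from contextlib import contextmanager
from typing import Any, Callable, Iterator


class ListTracer:
    """Hypothetical stand-in that records span names in the order they open."""

    def __init__(self) -> None:
        self.names: list = []

    @contextmanager
    def span(self, name: str, **attrs: Any) -> Iterator[None]:
        self.names.append(name)
        yield


def traced(tracer: Any, name: str, **attrs: Any) -> Callable:
    """Wrap a function so each call runs inside tracer.span(name, **attrs)."""

    def decorate(fn: Callable) -> Callable:
        @functools.wraps(fn)  # keep the wrapped function's name and docstring
        def wrapper(*args: Any, **kwargs: Any) -> Any:
            with tracer.span(name, **attrs):
                return fn(*args, **kwargs)

        return wrapper

    return decorate


tracer = ListTracer()


@traced(tracer, "reconcile", venue="alpaca")
def reconcile() -> int:
    return 3  # placeholder for real reconciliation work


matched = reconcile()
```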
Attribute hygiene
Keep attribute keys consistent. Standard names in the Horizon code:
- venue (not venue_name)
- account (not account_id)
- market_id
- side
- client_order_id
- correlation_id (threaded via horizon.observability.logging)
The OTel semantic conventions prefer dotted names (account.id). Pick one convention per deployment; the tracer stores whatever is passed.
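One way to enforce a single convention is to normalize keys before they reach the tracer. The sketch below maps the two discouraged spellings named above onto the standard ones and reports anything outside the standard set; normalize_attrs and unknown_keys are hypothetical helpers, not part of Horizon.

```python
from typing import Any, Dict, List

# Standard attribute names listed in this section.
STANDARD_KEYS = {
    "venue", "account", "market_id", "side", "client_order_id", "correlation_id",
}

# Discouraged spellings mapped to their standard form.
ALIASES = {"venue_name": "venue", "account_id": "account"}


def normalize_attrs(attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Rename known aliases; leave every other key untouched."""
    return {ALIASES.get(key, key): value for key, value in attrs.items()}


def unknown_keys(attrs: Dict[str, Any]) -> List[str]:
    """Return keys outside the standard set, sorted for stable output."""
    return sorted(set(attrs) - STANDARD_KEYS)


raw = {"venue_name": "alpaca", "account_id": "A1", "side": "buy", "qty": 10}
clean = normalize_attrs(raw)
extras = unknown_keys(clean)
```

Here clean carries venue and account in place of the aliases, and extras flags qty as a key the deployment has not standardized.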
Exception recording
When code inside a span raises, the tracer records the exception on the span and sets the status to ERROR before re-raising. This keeps observed latency accurate and surfaces failures in the trace UI without swallowing the original error.
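The record-then-re-raise behaviour can be sketched with a generator-based context manager. This is an illustration using only the standard library, not Horizon's implementation; SpanRecord and ErrorRecordingTracer are hypothetical names.

```python
import time
from contextlib import contextmanager
from dataclasses import dataclass
from typing import Any, Iterator, Optional


@dataclass
class SpanRecord:
    name: str
    attrs: dict
    status: str = "UNSET"
    exception: Optional[BaseException] = None
    duration_s: float = 0.0


class ErrorRecordingTracer:
    """Hypothetical tracer that marks failed spans and re-raises the error."""

    def __init__(self) -> None:
        self.finished: list = []

    @contextmanager
    def span(self, name: str, **attrs: Any) -> Iterator[SpanRecord]:
        record = SpanRecord(name, attrs)
        start = time.perf_counter()
        try:
            yield record
        except Exception as exc:
            record.status = "ERROR"
            record.exception = exc
            raise  # never swallow the original error
        finally:
            # Duration is measured on both the success and the failure path.
            record.duration_s = time.perf_counter() - start
            self.finished.append(record)


tracer = ErrorRecordingTracer()
caught = None
try:
    with tracer.span("venue.submit", venue="alpaca"):
        raise RuntimeError("broker reject")
except RuntimeError as exc:
    caught = exc
```

The caller still sees the original RuntimeError, while the finished span carries the ERROR status, the exception object, and the elapsed time.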
Shutdown
The tracer exposes shutdown() for flushing buffered spans at process exit. When running under hz.run(mode="live"), wrap the run in try/finally so shutdown() is called even if the run exits with an error:
try:
    hz.run(..., tracer=tracer, ...)
finally:
    tracer.shutdown()
The run loop does not call shutdown on a caller-provided tracer; that is the caller’s responsibility so a shared tracer survives across hz.run invocations.
Relationship to metrics and audit
- Metrics count things (orders submitted, errors per minute) for dashboards and SLO alerts. See Metrics.
- Audit events record what happened, in full, forever. See Audit trail.
- Traces show where time goes within a single operation, for incident triage.
Use all three. They answer different questions.
Out of scope
- Sampling config. Use the standard OTel OTEL_TRACES_SAMPLER env vars.
- Propagation across processes. OTel’s context propagation (W3C TraceContext) works; the Horizon code does not strip it.
- Metrics via OTel. Prometheus is the shipped metrics backend (see Metrics). You can layer OTel metrics on top if you prefer.
- Hot-path span density. The submit path is instrumented. Adding spans inside every feature / strategy step is the caller’s call; the tracer makes it cheap when NullTracer is active.
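For the sampling item above, the standard variables can also be set from Python, provided this happens before the tracer provider is created. A minimal sketch: the variable names are the standard OpenTelemetry SDK ones, while the sampler choice and the 10% ratio are example values only.

```python
import os

# setdefault keeps any value already provided by the deployment environment.
os.environ.setdefault("OTEL_TRACES_SAMPLER", "parentbased_traceidratio")
# Ratio argument for the sampler above: keep roughly 10% of new traces.
os.environ.setdefault("OTEL_TRACES_SAMPLER_ARG", "0.1")

sampler = os.environ["OTEL_TRACES_SAMPLER"]
```

In most deployments these are set in the service's environment (container spec, systemd unit) instead of in code, which keeps sampling adjustable without a redeploy.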
Related
- Metrics.
- Audit trail.
- Deployment for the OTel collector sidecar.