Flow Surveillance

Counterparty bot detection, manipulation surveillance, and policy reverse-engineering. Opt-in module that sits on top of the core SDK without modifying it.

horizon.flow is a separate opt-in module for observing OTHER participants in a venue. Detecting bots, flagging manipulation patterns (spoofing, layering, quote-stuffing, wash trading, momentum ignition, iceberg orders), clustering coordinated wallets, and reverse-engineering the policies of observable traders.

It is NOT a replacement for horizon.compliance, which polices your own firm for PDT, suitability, restrictions, and pre-clearance. Flow is about the market; compliance is about you.

Start here: I want to…

Pick the page that matches what you’re trying to do:

Try it for the first time → Quickstart. Zero to a detected spoof in 15 minutes, no live data needed.
Understand what’s happening inside → How it works. Mental model, event flow, the seven steps that run on every event.
Protect my strategy from toxic flow → Defend recipes. Gate orders on findings, reduce size against HFT counterparties, blacklist clusters.
Use flow signals as alpha → Alpha recipes. Fade wash-pumps, widen spreads via Kirilenko probabilities, anticipate a reverse-engineered bot.
Investigate a wallet or market → Investigation recipes. Deep-dive reports, findings exports, policy fitting, cluster analysis.
Run this in production → Operations recipes. Live wiring, cron scans, Slack/PagerDuty routing, debugging, scale.
Look up a specific class or config → the reference pages below (events, actors, detectors, toxicity, clustering, hidden, policy, store, CLI, risk integration).

Why this exists

If you are trading on Polymarket, Kalshi, or Hyperliquid, you are trading against bots. On-chain and DEX venues expose counterparty wallets, which makes actor-level attribution feasible in a way equity markets do not allow. horizon.flow turns that on-chain visibility into structured findings a fiduciary can defend.

Three use cases:

Defense. Don’t trade against a known toxic counterparty. Pull an actor profile before sizing, or let FlowAnomalyCheck reject orders in markets with active spoofing or wash findings.
Alpha. Identify bots that react to predictable triggers; infer their shadow policy (decision-tree rules with SHAP feature attribution); anticipate their flow.
Record-keeping. Every finding emits into the existing hash-chained audit log and a queryable SQLiteFlowStore. A compliance reviewer or regulator can reconstruct what you knew and when.

V0.1 venue targets

The first release targets venues that expose counterparty identities:

Polymarket. CLOB WebSocket + Polygon RPC enrichment. Wallets visible per trade; on-chain tx hash, gas price, nonce, block number are all capturable.
Hyperliquid. L3 orderbook WebSocket exposes wallet per resting order and fill.
Kalshi. Public market data (tape + orderbook). No counterparty IDs here, but market-level signals (VPIN, OFI, quote-stuffing, iceberg) still work.

Equities, options, and perps beyond Hyperliquid land in v0.2 with graceful degradation. Tape-level signals only, no per-actor attribution.

Integration seams

horizon.flow plugs into the existing SDK at four seams, none of which require core modifications:

AuditLog subscription

`FlowObserver` subscribes to your existing `AuditLog` via `log.subscribe(observer)`. Every order lifecycle event flows through the normalization layer into a `MarketEvent`.

python

from horizon.audit import AuditLog, SQLiteSink
from horizon.flow import FlowObserver, make_default_engine

log = AuditLog(sink=SQLiteSink("audit.db"))
engine = make_default_engine(venue_name="polymarket", store_path="flow.db", audit_log=log)
log.subscribe(FlowObserver(venue_name="polymarket", on_event=engine.ingest))

LiveFeed tick handler

FlowFeedHandler subscribes to any LiveFeed via feed.on_tick(handler). Quote, trade, and orderbook ticks normalize into the same MarketEvent stream.

External venue ingestion

PolymarketFlowSource, KalshiFlowSource, HyperliquidFlowSource, and ReplayFlowSource connect directly to the venues' public feeds (CLOB WS, L3 book) and feed events into the engine.

Risk-pipeline rejection (optional)

FlowAnomalyCheck plugs into RiskConfig.extra_checks to reject orders in markets with active anomaly findings above a configured severity.

Findings emit to the existing AuditLog as new AuditCategory members (FlowAnomaly, ActorProfiled, ClusterAssigned, PolicyInferred, BotDetected), so they live in the same hash-chained regulatory record as orders and risk decisions.

Quick start

python

import horizon as hz
from horizon.flow import SQLiteFlowStore, set_default_store, make_default_engine
from horizon.flow.ingestion import ReplayFlowSource

# 1. One-time setup. Default engine comes wired with all 7 detectors,
# the actor feature extractor, Kirilenko taxonomy, and the store.
store = SQLiteFlowStore("flow.db")
set_default_store(store)
engine = make_default_engine(venue_name="polymarket", store_path="flow.db")

# 2. Feed events. For tests, replay a recorded JSONL fixture.
# For live, swap in PolymarketFlowSource / HyperliquidFlowSource.
source = ReplayFlowSource(path="polymarket_day.jsonl")
source.on_event(engine.ingest)
source.connect()

# 3. Query findings.
for finding in hz.flow.anomalies(market_id="0xTRUMP..."):
 print(finding.category.value, finding.confidence, finding.message)

# 4. Inspect an actor.
profile = hz.flow.actor_profile("0xabc...", venue="polymarket")
print(profile.taxonomy_probs) # {'hft': 0.72, 'opportunistic': 0.18, ...}
print(profile.features.hawkes_branching_ratio)

What’s in this section

The Flow Surveillance tab is organized into five sub-groups. Use the sidebar to jump between them; here’s the map.

Start here

Orientation and first-run material. Read in order if you’re new.

Page	Topic
Overview (this page)	What the module is, where it fits, who should use it.
Quickstart	Zero to a first finding in 15 minutes. Copy-pasteable tutorial.
How it works	Internal pipeline and the seven steps per event.
Roadmap	v0.1 through v1.0 phasing; what’s shipped, what’s next.

Recipes (I want to…)

Concrete worked examples per goal. Read whichever matches your task.

Page	Topic
Defend	Protect strategies from toxic counterparties. Five recipes.
Alpha	Use flow as an input signal, not just a filter. Five recipes.
Investigate	Forensic use of the flow store. Five recipes.
Operations	Production deployment, cron, alerting, debugging. Five recipes.

Core concepts

The data model plus the two main derived artifacts. Read when you need to understand what a particular record means.

Page	Topic
Data model	`MarketEvent`, `AnomalyFinding`, `ActorProfile`, `WalletCluster`, `PolicyModel`.
Actor profiling	`ActorFeatureExtractor`, Kirilenko taxonomy, wallet heuristics.
Policy reverse-engineering	`ShadowPolicyFitter`; IRL interface for v0.3+.

Detection methods

How each detection works, what papers back it, what thresholds are tunable.

Page	Topic
Manipulation detectors	Spoofing, layering, quote-stuffing, wash, momentum ignition, iceberg.
Flow toxicity	VPIN, OFI, PIN, Hawkes branching ratio.
Hidden orders	Split-order linking + execution-algo fingerprinting.
Clustering	HDBSCAN behavioral, DTW temporal, Louvain network.

Integration & tooling

Plumbing: persistence, CLI, and how findings plug into the rest of Horizon.

Page	Topic
Flow store	`SQLiteFlowStore` schema, WORM trigger, retrieval API.
CLI	`horizon flow scan / profile / anomalies / cluster / reverse / verify`.
Risk integration	`FlowAnomalyCheck` for `RiskConfig.extra_checks`; alerter wiring.

Dependencies

horizon.flow itself imports only the SDK core. Optional capabilities are gated behind extras groups so the base install stays lean:

Extras	Purpose	Installs
`horizon[flow]`	Full flow module	scikit-learn, hdbscan, python-louvain, networkx, dtaidistance, web3, websockets
`horizon[flow-irl]`	IRL policy recovery (v0.3+)	torch, gymnasium, stable-baselines3

All optional dependencies import inside try / except ImportError guards. Missing deps degrade specific capabilities (behavioral clustering skips without hdbscan; Polymarket live connect raises a clear error without websockets) but the core module and the default shadow-policy path still work.

What flow is NOT

NOT a real-time HFT defense. Detectors run in the ingest hot path but are designed for signal quality, not microsecond latency. The module’s time horizon is seconds to minutes.
NOT a replacement for own-firm compliance. Keep horizon.compliance wired for PDT, suitability, restrictions, and pre-clearance. Flow adds a market-intelligence layer alongside them.
NOT a cross-venue wallet attribution system (yet). Identifying the same entity on Polymarket and Hyperliquid is deferred to v1.0 and will be opt-in.
NOT a replacement for your own diligence. Findings are informational. Treat them as inputs to decisions the fiduciary still makes.