Flow Surveillance
Counterparty bot detection, manipulation surveillance, and policy reverse-engineering. Opt-in module that sits on top of the core SDK without modifying it.
horizon.flow is a separate opt-in module for observing OTHER participants in a venue. Detecting bots, flagging manipulation patterns (spoofing, layering, quote-stuffing, wash trading, momentum ignition, iceberg orders), clustering coordinated wallets, and reverse-engineering the policies of observable traders.
It is NOT a replacement for horizon.compliance, which polices your own firm for PDT, suitability, restrictions, and pre-clearance. Flow is about the market; compliance is about you.
Start here: I want to…
Pick the page that matches what you’re trying to do:
- Try it for the first time → Quickstart. Zero to a detected spoof in 15 minutes, no live data needed.
- Understand what’s happening inside → How it works. Mental model, event flow, the seven steps that run on every event.
- Protect my strategy from toxic flow → Defend recipes. Gate orders on findings, reduce size against HFT counterparties, blacklist clusters.
- Use flow signals as alpha → Alpha recipes. Fade wash-pumps, widen spreads via Kirilenko probabilities, anticipate a reverse-engineered bot.
- Investigate a wallet or market → Investigation recipes. Deep-dive reports, findings exports, policy fitting, cluster analysis.
- Run this in production → Operations recipes. Live wiring, cron scans, Slack/PagerDuty routing, debugging, scale.
- Look up a specific class or config → the reference pages below (events, actors, detectors, toxicity, clustering, hidden, policy, store, CLI, risk integration).
Why this exists
If you are trading on Polymarket, Kalshi, or Hyperliquid, you are trading against bots. On-chain and DEX venues expose counterparty wallets, which makes actor-level attribution feasible in a way equity markets do not allow. horizon.flow turns that on-chain visibility into structured findings a fiduciary can defend.
Three use cases:
- Defense. Don’t trade against a known toxic counterparty. Pull an actor profile before sizing, or let
FlowAnomalyCheckreject orders in markets with active spoofing or wash findings. - Alpha. Identify bots that react to predictable triggers; infer their shadow policy (decision-tree rules with SHAP feature attribution); anticipate their flow.
- Record-keeping. Every finding emits into the existing hash-chained audit log and a queryable
SQLiteFlowStore. A compliance reviewer or regulator can reconstruct what you knew and when.
V0.1 venue targets
The first release targets venues that expose counterparty identities:
- Polymarket. CLOB WebSocket + Polygon RPC enrichment. Wallets visible per trade; on-chain tx hash, gas price, nonce, block number are all capturable.
- Hyperliquid. L3 orderbook WebSocket exposes wallet per resting order and fill.
- Kalshi. Public market data (tape + orderbook). No counterparty IDs here, but market-level signals (VPIN, OFI, quote-stuffing, iceberg) still work.
Equities, options, and perps beyond Hyperliquid land in v0.2 with graceful degradation. Tape-level signals only, no per-actor attribution.
Integration seams
horizon.flow plugs into the existing SDK at four seams, none of which require core modifications:
AuditLog subscription
from horizon.audit import AuditLog, SQLiteSink
from horizon.flow import FlowObserver, make_default_engine
log = AuditLog(sink=SQLiteSink("audit.db"))
engine = make_default_engine(venue_name="polymarket", store_path="flow.db", audit_log=log)
log.subscribe(FlowObserver(venue_name="polymarket", on_event=engine.ingest))
LiveFeed tick handler
FlowFeedHandler subscribes to any LiveFeed via feed.on_tick(handler). Quote, trade, and orderbook ticks normalize into the same MarketEvent stream.External venue ingestion
PolymarketFlowSource, KalshiFlowSource, HyperliquidFlowSource, and ReplayFlowSource connect directly to the venues' public feeds (CLOB WS, L3 book) and feed events into the engine.Risk-pipeline rejection (optional)
FlowAnomalyCheck plugs into RiskConfig.extra_checks to reject orders in markets with active anomaly findings above a configured severity.Findings emit to the existing AuditLog as new AuditCategory members (FlowAnomaly, ActorProfiled, ClusterAssigned, PolicyInferred, BotDetected), so they live in the same hash-chained regulatory record as orders and risk decisions.
Quick start
import horizon as hz
from horizon.flow import SQLiteFlowStore, set_default_store, make_default_engine
from horizon.flow.ingestion import ReplayFlowSource
# 1. One-time setup. Default engine comes wired with all 7 detectors,
# the actor feature extractor, Kirilenko taxonomy, and the store.
store = SQLiteFlowStore("flow.db")
set_default_store(store)
engine = make_default_engine(venue_name="polymarket", store_path="flow.db")
# 2. Feed events. For tests, replay a recorded JSONL fixture.
# For live, swap in PolymarketFlowSource / HyperliquidFlowSource.
source = ReplayFlowSource(path="polymarket_day.jsonl")
source.on_event(engine.ingest)
source.connect()
# 3. Query findings.
for finding in hz.flow.anomalies(market_id="0xTRUMP..."):
print(finding.category.value, finding.confidence, finding.message)
# 4. Inspect an actor.
profile = hz.flow.actor_profile("0xabc...", venue="polymarket")
print(profile.taxonomy_probs) # {'hft': 0.72, 'opportunistic': 0.18, ...}
print(profile.features.hawkes_branching_ratio)
What’s in this section
The Flow Surveillance tab is organized into five sub-groups. Use the sidebar to jump between them; here’s the map.
Start here
Orientation and first-run material. Read in order if you’re new.
| Page | Topic |
|---|---|
| Overview (this page) | What the module is, where it fits, who should use it. |
| Quickstart | Zero to a first finding in 15 minutes. Copy-pasteable tutorial. |
| How it works | Internal pipeline and the seven steps per event. |
| Roadmap | v0.1 through v1.0 phasing; what’s shipped, what’s next. |
Recipes (I want to…)
Concrete worked examples per goal. Read whichever matches your task.
| Page | Topic |
|---|---|
| Defend | Protect strategies from toxic counterparties. Five recipes. |
| Alpha | Use flow as an input signal, not just a filter. Five recipes. |
| Investigate | Forensic use of the flow store. Five recipes. |
| Operations | Production deployment, cron, alerting, debugging. Five recipes. |
Core concepts
The data model plus the two main derived artifacts. Read when you need to understand what a particular record means.
| Page | Topic |
|---|---|
| Data model | MarketEvent, AnomalyFinding, ActorProfile, WalletCluster, PolicyModel. |
| Actor profiling | ActorFeatureExtractor, Kirilenko taxonomy, wallet heuristics. |
| Policy reverse-engineering | ShadowPolicyFitter; IRL interface for v0.3+. |
Detection methods
How each detection works, what papers back it, what thresholds are tunable.
| Page | Topic |
|---|---|
| Manipulation detectors | Spoofing, layering, quote-stuffing, wash, momentum ignition, iceberg. |
| Flow toxicity | VPIN, OFI, PIN, Hawkes branching ratio. |
| Hidden orders | Split-order linking + execution-algo fingerprinting. |
| Clustering | HDBSCAN behavioral, DTW temporal, Louvain network. |
Integration & tooling
Plumbing: persistence, CLI, and how findings plug into the rest of Horizon.
| Page | Topic |
|---|---|
| Flow store | SQLiteFlowStore schema, WORM trigger, retrieval API. |
| CLI | horizon flow scan / profile / anomalies / cluster / reverse / verify. |
| Risk integration | FlowAnomalyCheck for RiskConfig.extra_checks; alerter wiring. |
Dependencies
horizon.flow itself imports only the SDK core. Optional capabilities are gated behind extras groups so the base install stays lean:
| Extras | Purpose | Installs |
|---|---|---|
horizon[flow] | Full flow module | scikit-learn, hdbscan, python-louvain, networkx, dtaidistance, web3, websockets |
horizon[flow-irl] | IRL policy recovery (v0.3+) | torch, gymnasium, stable-baselines3 |
All optional dependencies import inside try / except ImportError guards. Missing deps degrade specific capabilities (behavioral clustering skips without hdbscan; Polymarket live connect raises a clear error without websockets) but the core module and the default shadow-policy path still work.
What flow is NOT
- NOT a real-time HFT defense. Detectors run in the ingest hot path but are designed for signal quality, not microsecond latency. The module’s time horizon is seconds to minutes.
- NOT a replacement for own-firm compliance. Keep
horizon.compliancewired for PDT, suitability, restrictions, and pre-clearance. Flow adds a market-intelligence layer alongside them. - NOT a cross-venue wallet attribution system (yet). Identifying the same entity on Polymarket and Hyperliquid is deferred to v1.0 and will be opt-in.
- NOT a replacement for your own diligence. Findings are informational. Treat them as inputs to decisions the fiduciary still makes.