Flow Surveillance

Counterparty bot detection, manipulation surveillance, and policy reverse-engineering. Opt-in module that sits on top of the core SDK without modifying it.

horizon.flow is a separate opt-in module for observing OTHER participants in a venue: detecting bots, flagging manipulation patterns (spoofing, layering, quote-stuffing, wash trading, momentum ignition, iceberg orders), clustering coordinated wallets, and reverse-engineering the policies of observable traders.

It is NOT a replacement for horizon.compliance, which polices your own firm for PDT, suitability, restrictions, and pre-clearance. Flow is about the market; compliance is about you.

Start here: I want to…

Pick the page that matches what you’re trying to do:

  • Try it for the first time → Quickstart. Zero to a detected spoof in 15 minutes, no live data needed.
  • Understand what’s happening inside → How it works. Mental model, event flow, the seven steps that run on every event.
  • Protect my strategy from toxic flow → Defend recipes. Gate orders on findings, reduce size against HFT counterparties, blacklist clusters.
  • Use flow signals as alpha → Alpha recipes. Fade wash-pumps, widen spreads via Kirilenko probabilities, anticipate a reverse-engineered bot.
  • Investigate a wallet or market → Investigation recipes. Deep-dive reports, findings exports, policy fitting, cluster analysis.
  • Run this in production → Operations recipes. Live wiring, cron scans, Slack/PagerDuty routing, debugging, scale.
  • Look up a specific class or config → the reference pages below (events, actors, detectors, toxicity, clustering, hidden, policy, store, CLI, risk integration).

Why this exists

If you are trading on Polymarket, Kalshi, or Hyperliquid, you are trading against bots. On-chain and DEX venues expose counterparty wallets, which makes actor-level attribution feasible in a way equity markets do not allow. horizon.flow turns that on-chain visibility into structured findings a fiduciary can defend.

Three use cases:

  • Defense. Don’t trade against a known toxic counterparty. Pull an actor profile before sizing, or let FlowAnomalyCheck reject orders in markets with active spoofing or wash findings.
  • Alpha. Identify bots that react to predictable triggers; infer their shadow policy (decision-tree rules with SHAP feature attribution); anticipate their flow.
  • Record-keeping. Every finding emits into the existing hash-chained audit log and a queryable SQLiteFlowStore. A compliance reviewer or regulator can reconstruct what you knew and when.

v0.1 venue targets

The first release targets venues that expose counterparty identities:

  • Polymarket. CLOB WebSocket + Polygon RPC enrichment. Wallets visible per trade; on-chain tx hash, gas price, nonce, block number are all capturable.
  • Hyperliquid. L3 orderbook WebSocket exposes wallet per resting order and fill.
  • Kalshi. Public market data (tape + orderbook). No counterparty IDs here, but market-level signals (VPIN, OFI, quote-stuffing, iceberg) still work.

Equities, options, and perps beyond Hyperliquid land in v0.2 with graceful degradation: tape-level signals only, no per-actor attribution.
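Market-level signals such as VPIN need only the trade tape, which is why they still work where no counterparty IDs are exposed. The following is a minimal sketch of a volume-bucketed VPIN estimate, not the module's implementation. It assumes trades arrive already signed (positive for buyer-initiated, negative for seller-initiated), whereas the original VPIN formulation classifies volume in bulk; the function name and parameters are hypothetical.

```python
from collections import deque


def vpin(signed_volumes, bucket_volume, window):
    """Simplified VPIN over pre-signed trade volumes.

    signed_volumes: iterable of numbers, > 0 buyer-initiated, < 0 seller-initiated.
    bucket_volume:  total volume per bucket (all buckets have equal volume).
    window:         number of most recent completed buckets to average over.
    Returns None until `window` buckets have been completed.
    """
    if bucket_volume <= 0 or window <= 0:
        raise ValueError("bucket_volume and window must be positive")

    imbalances = deque(maxlen=window)  # |buy - sell| for each completed bucket
    buy = sell = 0.0
    for v in signed_volumes:
        remaining = abs(v)
        while remaining > 0:
            # Fill the current bucket; a large trade may span several buckets.
            room = bucket_volume - (buy + sell)
            take = min(room, remaining)
            if v > 0:
                buy += take
            else:
                sell += take
            remaining -= take
            if take == room:  # bucket is full: record imbalance and reset
                imbalances.append(abs(buy - sell))
                buy = sell = 0.0

    if len(imbalances) < window:
        return None
    return sum(imbalances) / (window * bucket_volume)
```

Values near 0 indicate balanced flow; values near 1 indicate that recent buckets were almost entirely one-sided.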

Integration seams

horizon.flow plugs into the existing SDK at four seams, none of which require core modifications:

AuditLog subscription

`FlowObserver` subscribes to your existing `AuditLog` via `log.subscribe(observer)`. Every order lifecycle event flows through the normalization layer into a `MarketEvent`.
```python
from horizon.audit import AuditLog, SQLiteSink
from horizon.flow import FlowObserver, make_default_engine

# Existing audit log, plus a flow engine that writes its findings back into it.
log = AuditLog(sink=SQLiteSink("audit.db"))
engine = make_default_engine(venue_name="polymarket", store_path="flow.db", audit_log=log)

# Route every order lifecycle event from the log into the engine.
log.subscribe(FlowObserver(venue_name="polymarket", on_event=engine.ingest))
```

LiveFeed tick handler

FlowFeedHandler subscribes to any LiveFeed via feed.on_tick(handler). Quote, trade, and orderbook ticks normalize into the same MarketEvent stream.
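To make the idea of normalization concrete, here is a minimal, self-contained sketch of mapping three tick shapes onto one event type. The `MarketEvent` fields and the tick dictionaries below are hypothetical stand-ins chosen for illustration; the SDK's actual schema may differ.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical stand-in for the flow module's MarketEvent; real fields may differ.
@dataclass(frozen=True)
class MarketEvent:
    kind: str                      # "trade", "quote", or "book"
    market_id: str
    ts: float
    price: Optional[float] = None
    size: Optional[float] = None


def normalize_tick(tick: dict) -> MarketEvent:
    """Map a raw tick dict (assumed shape) onto the common event type."""
    kind = tick["type"]
    if kind == "trade":
        return MarketEvent("trade", tick["market"], tick["ts"],
                           price=tick["price"], size=tick["size"])
    if kind == "quote":
        # Represent a two-sided quote by its midpoint.
        mid = (tick["bid"] + tick["ask"]) / 2
        return MarketEvent("quote", tick["market"], tick["ts"], price=mid)
    if kind == "book":
        # Book snapshots carry no single price in this sketch.
        return MarketEvent("book", tick["market"], tick["ts"])
    raise ValueError(f"unknown tick type: {kind!r}")
```

Downstream consumers then handle one type regardless of which feed produced the tick.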

External venue ingestion

PolymarketFlowSource, KalshiFlowSource, and HyperliquidFlowSource connect directly to the venues' public feeds (CLOB WS, L3 book, public market data); ReplayFlowSource replays a recorded JSONL file instead. All four feed events into the engine.

Risk-pipeline rejection (optional)

FlowAnomalyCheck plugs into RiskConfig.extra_checks to reject orders in markets with active anomaly findings above a configured severity.
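The gating logic can be sketched as a severity-threshold check. This is an illustration of the idea only, not the `FlowAnomalyCheck` implementation: the `Finding` class, the three-level severity scale, and the "at or above" comparison are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical severity scale; the module's real levels may differ.
SEVERITY_RANK = {"low": 0, "medium": 1, "high": 2}


# Hypothetical stand-in for an anomaly finding.
@dataclass
class Finding:
    market_id: str
    category: str        # e.g. "spoofing", "wash"
    severity: str        # key of SEVERITY_RANK
    active: bool = True


def should_reject(order_market_id, findings, min_severity="medium"):
    """Return True if the order's market has an active finding at or above min_severity."""
    threshold = SEVERITY_RANK[min_severity]
    return any(
        f.active
        and f.market_id == order_market_id
        and SEVERITY_RANK[f.severity] >= threshold
        for f in findings
    )
```

Findings on other markets, inactive findings, and findings below the threshold leave the order untouched.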

Findings emit to the existing AuditLog as new AuditCategory members (FlowAnomaly, ActorProfiled, ClusterAssigned, PolicyInferred, BotDetected), so they live in the same hash-chained regulatory record as orders and risk decisions.
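For readers unfamiliar with hash chaining, the general technique looks roughly like the sketch below: each record stores the hash of its predecessor, so editing any earlier record invalidates every hash after it. This shows the generic pattern using SHA-256 and JSON; the record layout is hypothetical and is not the AuditLog's actual format.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record


def _digest(prev_hash, payload):
    # Canonical JSON (sorted keys) so the same payload always hashes the same way.
    body = json.dumps(payload, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain, payload):
    """Append a payload to the chain, linking it to the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    record = {"prev": prev_hash, "payload": payload, "hash": _digest(prev_hash, payload)}
    chain.append(record)
    return record


def verify_chain(chain):
    """Recompute every hash in order; any edited record breaks the chain."""
    prev_hash = GENESIS
    for record in chain:
        if record["prev"] != prev_hash or record["hash"] != _digest(prev_hash, record["payload"]):
            return False
        prev_hash = record["hash"]
    return True
```

Because orders, risk decisions, and findings share one chain, a reviewer can check their relative ordering along with their integrity.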

Quick start

```python
import horizon as hz
from horizon.flow import SQLiteFlowStore, set_default_store, make_default_engine
from horizon.flow.ingestion import ReplayFlowSource

# 1. One-time setup. Default engine comes wired with all 7 detectors,
# the actor feature extractor, Kirilenko taxonomy, and the store.
store = SQLiteFlowStore("flow.db")
set_default_store(store)
engine = make_default_engine(venue_name="polymarket", store_path="flow.db")

# 2. Feed events. For tests, replay a recorded JSONL fixture.
# For live, swap in PolymarketFlowSource / HyperliquidFlowSource.
source = ReplayFlowSource(path="polymarket_day.jsonl")
source.on_event(engine.ingest)
source.connect()

# 3. Query findings.
for finding in hz.flow.anomalies(market_id="0xTRUMP..."):
    print(finding.category.value, finding.confidence, finding.message)

# 4. Inspect an actor.
profile = hz.flow.actor_profile("0xabc...", venue="polymarket")
print(profile.taxonomy_probs)  # {'hft': 0.72, 'opportunistic': 0.18, ...}
print(profile.features.hawkes_branching_ratio)
```

What’s in this section

The Flow Surveillance tab is organized into five sub-groups. Use the sidebar to jump between them; here’s the map.

Start here

Orientation and first-run material. Read in order if you’re new.

| Page | Topic |
| --- | --- |
| Overview (this page) | What the module is, where it fits, who should use it. |
| Quickstart | Zero to a first finding in 15 minutes. Copy-pasteable tutorial. |
| How it works | Internal pipeline and the seven steps per event. |
| Roadmap | v0.1 through v1.0 phasing; what’s shipped, what’s next. |

Recipes (I want to…)

Concrete worked examples per goal. Read whichever matches your task.

| Page | Topic |
| --- | --- |
| Defend | Protect strategies from toxic counterparties. Five recipes. |
| Alpha | Use flow as an input signal, not just a filter. Five recipes. |
| Investigate | Forensic use of the flow store. Five recipes. |
| Operations | Production deployment, cron, alerting, debugging. Five recipes. |

Core concepts

The data model plus the two main derived artifacts. Read when you need to understand what a particular record means.

| Page | Topic |
| --- | --- |
| Data model | MarketEvent, AnomalyFinding, ActorProfile, WalletCluster, PolicyModel. |
| Actor profiling | ActorFeatureExtractor, Kirilenko taxonomy, wallet heuristics. |
| Policy reverse-engineering | ShadowPolicyFitter; IRL interface for v0.3+. |

Detection methods

How each detection works, what papers back it, what thresholds are tunable.

| Page | Topic |
| --- | --- |
| Manipulation detectors | Spoofing, layering, quote-stuffing, wash, momentum ignition, iceberg. |
| Flow toxicity | VPIN, OFI, PIN, Hawkes branching ratio. |
| Hidden orders | Split-order linking + execution-algo fingerprinting. |
| Clustering | HDBSCAN behavioral, DTW temporal, Louvain network. |

Integration & tooling

Plumbing: persistence, CLI, and how findings plug into the rest of Horizon.

| Page | Topic |
| --- | --- |
| Flow store | SQLiteFlowStore schema, WORM trigger, retrieval API. |
| CLI | horizon flow scan / profile / anomalies / cluster / reverse / verify. |
| Risk integration | FlowAnomalyCheck for RiskConfig.extra_checks; alerter wiring. |
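The WORM (write once, read many) trigger mentioned for the flow store can be illustrated with plain SQLite. The sketch below shows the general technique of blocking UPDATE and DELETE with triggers; the `findings` table and its columns are hypothetical and are not the SQLiteFlowStore schema.

```python
import sqlite3


def open_worm_store(path=":memory:"):
    """Open a SQLite database whose `findings` table is append-only."""
    conn = sqlite3.connect(path)
    conn.executescript(
        """
        -- Hypothetical table for illustration only.
        CREATE TABLE IF NOT EXISTS findings (
            id         INTEGER PRIMARY KEY,
            market_id  TEXT NOT NULL,
            category   TEXT NOT NULL,
            confidence REAL NOT NULL
        );

        -- Abort any attempt to modify an existing row.
        CREATE TRIGGER IF NOT EXISTS findings_no_update
        BEFORE UPDATE ON findings
        BEGIN
            SELECT RAISE(ABORT, 'findings is append-only');
        END;

        -- Abort any attempt to delete a row.
        CREATE TRIGGER IF NOT EXISTS findings_no_delete
        BEFORE DELETE ON findings
        BEGIN
            SELECT RAISE(ABORT, 'findings is append-only');
        END;
        """
    )
    return conn
```

Inserts succeed as usual, while updates and deletes raise a `sqlite3.DatabaseError` subclass and leave the row unchanged. Triggers guard against accidental or application-level mutation; anyone with file-level access can still drop them, so this pattern is usually paired with an integrity check such as a hash chain.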

Dependencies

horizon.flow itself imports only the SDK core. Optional capabilities are gated behind extras groups so the base install stays lean:

| Extras | Purpose | Installs |
| --- | --- | --- |
| horizon[flow] | Full flow module | scikit-learn, hdbscan, python-louvain, networkx, dtaidistance, web3, websockets |
| horizon[flow-irl] | IRL policy recovery (v0.3+) | torch, gymnasium, stable-baselines3 |

All optional dependencies import inside try / except ImportError guards. Missing deps degrade specific capabilities (behavioral clustering skips without hdbscan; Polymarket live connect raises a clear error without websockets) but the core module and the default shadow-policy path still work.
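The guard pattern described above can be sketched as follows. The helper names `optional_import` and `behavioral_clusters` are hypothetical and chosen for this example; only the `hdbscan.HDBSCAN(...).fit_predict(...)` call refers to a real library API, and it runs only when that package is installed.

```python
import importlib


def optional_import(module_name):
    """Return the imported module, or None if it is not installed."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None


def behavioral_clusters(feature_rows, clusterer_module="hdbscan"):
    """Cluster feature rows if the optional dependency is present, else skip.

    Returns an array of cluster labels, or None when the capability is unavailable.
    """
    mod = optional_import(clusterer_module)
    if mod is None:
        return None  # capability degrades; callers continue without clusters
    return mod.HDBSCAN(min_cluster_size=2).fit_predict(feature_rows)
```

Callers treat `None` as "capability skipped" and carry on, which keeps the base install usable without the extras.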

What flow is NOT

  • NOT a real-time HFT defense. Detectors run in the ingest hot path but are designed for signal quality, not microsecond latency. The module’s time horizon is seconds to minutes.
  • NOT a replacement for own-firm compliance. Keep horizon.compliance wired for PDT, suitability, restrictions, and pre-clearance. Flow adds a market-intelligence layer alongside them.
  • NOT a cross-venue wallet attribution system (yet). Identifying the same entity on Polymarket and Hyperliquid is deferred to v1.0 and will be opt-in.
  • NOT a replacement for your own diligence. Findings are informational. Treat them as inputs to decisions the fiduciary still makes.