Roadmap
What's shipped in v0.1, what's coming through v1.0, and what's deliberately out of scope.
horizon.flow ships in phases. v0.1 is what landed with the first commit; each subsequent version adds capability without breaking the previous release’s API.
v0.1: Foundations + Polymarket / Kalshi / Hyperliquid
Shipped. What you can do today:
- Ingest public CLOB + on-chain events from Polymarket, Kalshi, Hyperliquid, and from any LiveFeed or AuditLog via observers.
- Profile actors incrementally with rolling features (order-to-trade ratio, inter-arrival CV, Hawkes branching proxy, maker ratio, gas-price fingerprint on Polygon).
- Classify each actor via a Kirilenko et al. (2017) 6-category soft-label: HFT, opportunistic, fundamental buyer/seller, small, intermediary.
- Cluster related wallets via four orthogonal methods: Meiklejohn / Victor heuristics for Polygon, HDBSCAN on features, DTW on inter-event timing, Louvain on the co-trading graph.
- Detect six manipulation patterns: spoofing (Lee-Eom-Park 2013), layering (FINRA 5210), quote-stuffing (Egginton 2016), wash trading (Cong-Li-Tang-Yang 2023), momentum ignition, iceberg reloads (Hautsch-Huang 2012).
- Cross-validate with flow-toxicity measures: VPIN, OFI, PIN, Hawkes branching ratio.
- Reverse-engineer a per-actor shadow policy: a decision tree plus a gradient-boosted classifier with SHAP feature attribution, yielding human-readable rules.
- Persist every finding in SQLiteFlowStore (WORM trigger on the anomalies table) AND in the hash-chained AuditLog.
- Query via hz.flow.actor_profile(), hz.flow.anomalies(), hz.flow.cluster_of(), hz.flow.shadow_policy() or the CLI.
- Gate trades via FlowAnomalyCheck in RiskConfig.extra_checks.
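Two of the rolling features named above (order-to-trade ratio and inter-arrival CV) can be maintained in constant memory per actor. The following is a minimal sketch of that idea, not the module's extractor: the class name, the `update` signature, and the `"order"` / `"trade"` event kinds are assumptions made for illustration.

```python
import math
from dataclasses import dataclass
from typing import Optional


@dataclass
class RollingActorProfile:
    """Hypothetical incremental profile for one actor (illustration only)."""

    orders: int = 0
    trades: int = 0
    last_ts: Optional[float] = None
    n_gaps: int = 0
    mean_gap: float = 0.0
    m2_gap: float = 0.0  # running sum of squared deviations (Welford)

    def update(self, ts: float, kind: str) -> None:
        """Fold one event into the profile; `kind` is 'order' or 'trade'."""
        if kind == "order":
            self.orders += 1
        elif kind == "trade":
            self.trades += 1
        if self.last_ts is not None:
            gap = ts - self.last_ts
            self.n_gaps += 1
            delta = gap - self.mean_gap
            self.mean_gap += delta / self.n_gaps
            self.m2_gap += delta * (gap - self.mean_gap)
        self.last_ts = ts

    @property
    def order_to_trade_ratio(self) -> float:
        return self.orders / self.trades if self.trades else math.inf

    @property
    def inter_arrival_cv(self) -> float:
        """Coefficient of variation of inter-event gaps (std / mean)."""
        if self.n_gaps < 2 or self.mean_gap == 0.0:
            return math.nan
        std = math.sqrt(self.m2_gap / (self.n_gaps - 1))
        return std / self.mean_gap


# Usage: three orders then one trade, arriving exactly one second apart.
profile = RollingActorProfile()
for ts, kind in [(0.0, "order"), (1.0, "order"), (2.0, "order"), (3.0, "trade")]:
    profile.update(ts, kind)
```

A perfectly regular arrival pattern like this one yields a CV of zero, which is the kind of signature that separates scheduled bots from bursty human flow.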
Test coverage: 92 flow-specific tests (paper-reproduction for VPIN and Kirilenko taxonomy; threshold-sensitivity per detector; shadow-policy end-to-end; clustering on bimodal and co-trading populations).
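For orientation, VPIN over equal-volume buckets can be sketched as follows. This is a simplified illustration rather than the module's implementation: it assumes each trade already carries a side (+1 buy, -1 sell), whereas the original VPIN formulation classifies volume in bulk, and the function name and parameters are hypothetical.

```python
from collections import deque


def vpin(trades, bucket_volume, n_buckets):
    """Return VPIN over the last `n_buckets` full volume buckets, or None.

    trades: iterable of (volume, side), side > 0 for buys, otherwise sells.
    Each bucket holds exactly `bucket_volume`; large trades spill over.
    """
    imbalances = deque(maxlen=n_buckets)
    buy = sell = 0.0
    for volume, side in trades:
        remaining = float(volume)
        while remaining > 0:
            room = bucket_volume - (buy + sell)
            fill = min(remaining, room)
            if side > 0:
                buy += fill
            else:
                sell += fill
            remaining -= fill
            if fill == room:  # bucket is full: record imbalance and reset
                imbalances.append(abs(buy - sell))
                buy = sell = 0.0
    if len(imbalances) < n_buckets:
        return None  # not enough completed buckets yet
    return sum(imbalances) / (n_buckets * bucket_volume)


# Usage: one fully one-sided bucket followed by one balanced bucket.
example = vpin([(10, +1), (5, +1), (5, -1)], bucket_volume=10, n_buckets=2)
```

Values near 1 indicate one-sided (toxic) flow; values near 0 indicate balanced flow.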
v0.2: Equities / options / perps with graceful degradation
Partially shipped. Tape-level ingestion for venues that don’t expose counterparty identities.
Shipped in v0.2.0:
- ActorFeatureExtractor tolerates actor_id=None by aggregating into anon_{market}_{window} pseudo-actors (default 5-minute windows). Kirilenko taxonomy applies to the bucket’s aggregate behavior. Read as “this market/window is HFT-dominated” rather than “this wallet is HFT.”
- AlpacaFlowSource. Wraps the existing AlpacaLiveFeed and emits normalized MarketEvents with actor_id=None. Template for IBKR / CCXT sources to follow.
- Detectors handle anonymous tape cleanly: actor-scoped ones (spoofing, layering, momentum-ignition, split-order) skip; market-level ones (iceberg, wash-trade, quote-stuffing, toxicity) still fire.
- New recipe page: Equities & options tape.
- Tests: 13 new tests covering anonymization, graceful skip, AlpacaFlowSource wiring.
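The pseudo-actor bucketing described above can be pictured roughly as below. This is a sketch under stated assumptions: the function name is hypothetical, and rendering `{window}` as an integer window index is a guess at the ID format, not the documented one. Returning None for a non-positive window mirrors the skip behavior of v0.1.

```python
import math


def resolve_actor_id(actor_id, market, ts, window_s=300.0):
    """Map an event to a real actor ID or an anonymous per-window bucket.

    Hypothetical helper: known actors pass through unchanged; anonymous
    events fall into `anon_{market}_{window}` where window is assumed to
    be floor(ts / window_s). A window of 0 disables bucketing.
    """
    if actor_id is not None:
        return actor_id
    if window_s <= 0:
        return None  # caller skips actor-scoped profiling for this event
    window_index = math.floor(ts / window_s)
    return f"anon_{market}_{window_index}"


# Usage: an anonymous equities print 610 s into the session lands in window 2.
bucket = resolve_actor_id(None, "AAPL", 610.0)
```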
Coming in v0.2.1:
- IBKRFlowSource following the same template.
- CCXTFlowSource for crypto exchanges via CCXT.
- Fully anonymized-path behavioral audit alongside the existing wallet-level one.
What stays unchanged from v0.1: all public APIs, the audit category set, the flow-store schema. A v0.1 deployment picks up v0.2 semantics automatically; set FlowConfig.actors.anonymize_window_s = 0.0 to preserve the v0.1 skip behavior.
v0.3: Inverse RL
Partially shipped.
Shipped in v0.3.0:
- MaxEntIRLFitter fully implemented. Ziebart et al. (2008) with discretized state space, empirical transitions + Laplace smoothing, soft value iteration, and gradient descent on a per-(state, action) reward basis.
- Pure numpy. No torch dependency. Runs on the base horizon install; no extras required.
- 8 tests including a Ziebart-style 5×5 gridworld paper-reproduction: 400 expert trajectories toward the goal corner → recovered reward’s argmax sits at the goal. Deterministic.
- PolicyModel output: per-action reward weights, top-rewarding (state, action) readouts with human-readable bin centers, log-likelihood, and convergence diagnostics. The model round-trips cleanly through the flow store.
- Doc update at Policy reverse-engineering.
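To make two of the ingredients listed above concrete (Laplace-smoothed empirical transitions and soft value iteration), here is a minimal numpy sketch. It is an illustration of the standard soft Bellman backup, not the code inside MaxEntIRLFitter; the function names, the discount factor, and the toy three-state chain are all assumptions.

```python
import numpy as np


def smoothed_transitions(counts, alpha=1.0):
    """Laplace-smooth empirical transition counts of shape (S, A, S)."""
    smoothed = counts + alpha
    return smoothed / smoothed.sum(axis=2, keepdims=True)


def logsumexp_rows(q):
    """Numerically stable log-sum-exp over the action axis of a (S, A) array."""
    q_max = q.max(axis=1, keepdims=True)
    return (q_max + np.log(np.exp(q - q_max).sum(axis=1, keepdims=True))).ravel()


def soft_value_iteration(reward, transitions, gamma=0.9, n_iters=200):
    """Return (V, policy) for a per-(state, action) reward table.

    Q(s, a) = r(s, a) + gamma * sum_s' P(s' | s, a) V(s')
    V(s)    = log sum_a exp Q(s, a)
    policy  = exp(Q - V), i.e. a softmax over actions.
    """
    V = np.zeros(reward.shape[0])
    for _ in range(n_iters):
        Q = reward + gamma * (transitions @ V)
        V = logsumexp_rows(Q)
    Q = reward + gamma * (transitions @ V)
    policy = np.exp(Q - logsumexp_rows(Q)[:, None])
    return V, policy


# Usage: a 3-state chain where only state 2 is rewarding.
n_states, left, right = 3, 0, 1
counts = np.zeros((n_states, 2, n_states))
for s in range(n_states):
    counts[s, left, max(s - 1, 0)] = 10
    counts[s, right, min(s + 1, n_states - 1)] = 10
reward = np.zeros((n_states, 2))
reward[2, :] = 1.0
V, policy = soft_value_iteration(reward, smoothed_transitions(counts))
```

In this toy chain the soft-optimal policy prefers moving right from states 0 and 1, which is the forward pass an IRL fitter repeats while adjusting the reward table to match observed behavior.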
Shipped in v0.3.1 ([flow-irl] extras, torch):
- GAILFitter. Ho & Ermon (2016), offline variant. Fits a torch discriminator D(s, a) to distinguish expert demonstrations from random-policy samples, reports the learned reward r = log D - log(1 - D) with per-action preferences and feature-gradient importance. Offline because markets have no rewindable simulator. We drop the on-policy TRPO/PPO loop and keep the discriminator-as-reward formulation.
- AIRLFitter. Fu, Luo, Levine (2018), offline variant. Extends GAIL with the state-only reward decomposition D(s, a) = exp(f(s)) / (exp(f(s)) + 1/|A|). Output f(s) identifies market states the actor values regardless of action. Useful for reward transfer.
- 8 tests (refusal, shape, directional recovery, roundtrip, determinism, AIRL top-k ordering).
- Both require torch via the [flow-irl] extras; graceful ModuleNotFoundError without it.
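The two formulas above fit together neatly: r = log D - log(1 - D) is the logit of the discriminator, and under the AIRL form of D that logit equals f(s) plus the constant log|A|. The numpy check below illustrates this identity only; it does not use torch or the fitters themselves, and the helper names are hypothetical.

```python
import numpy as np


def gail_reward(d):
    """r = log D - log(1 - D), clipped away from 0 and 1 for stability."""
    d = np.clip(d, 1e-8, 1.0 - 1e-8)
    return np.log(d) - np.log1p(-d)


def airl_discriminator(f_s, n_actions):
    """D = exp(f) / (exp(f) + 1/|A|), written as sigmoid(f + log|A|)."""
    return 1.0 / (1.0 + np.exp(-(f_s + np.log(n_actions))))


# Usage: recover f(s) from D up to the additive constant log|A|.
f_s = np.array([-2.0, 0.0, 1.5])
n_actions = 4
recovered = gail_reward(airl_discriminator(f_s, n_actions))
```

Because the offset is constant across states, ranking states by the recovered reward gives the same ordering as ranking them by f(s).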
Users who want interpretable output today (most compliance cases) stay on the default shadow policy path; MaxEnt / GAIL / AIRL offer progressively richer views of the same demonstration set.
v0.4: ML-augmented anomaly detection
Partially shipped.
Shipped in v0.4.0:
- IsolationForestDetector. Liu, Ting, Zhou (2008). sklearn-based; no torch dependency, no new extras required.
- Scores a 5-dimensional market-state vector (spread, depth imbalance, multi-horizon mid returns, realized vol) against a forest fit on a burn-in window of that market’s normal activity. Flags statistical outliers as AnomalyCategory.MarketAnomaly.
- Complements the rule-based detectors. Rules catch known patterns, Isolation Forest catches the long tail. Both layers run simultaneously in a production engine.
- Cooldown + periodic refit so long-running deployments adapt to regime drift without emitting hundreds of duplicate findings.
- Optional prefit() path skips burn-in when historical training data is available.
- 7 tests including flash-event detection, cooldown suppression, and full FlowEngine integration.
- New doc page: ML anomaly detection.
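The burn-in, score, and cooldown lifecycle described above can be sketched with scikit-learn directly. This is an illustrative stand-in, not IsolationForestDetector: the class name, constructor parameters, and `observe` method are assumptions, periodic refit is omitted for brevity, and the synthetic 5-dimensional Gaussian states stand in for real market-state vectors.

```python
import numpy as np
from sklearn.ensemble import IsolationForest


class BurnInIsolationForest:
    """Hypothetical lifecycle sketch: burn-in -> fit -> score -> cooldown."""

    def __init__(self, burn_in=500, cooldown_s=60.0, seed=0):
        self.burn_in = burn_in
        self.cooldown_s = cooldown_s
        self.seed = seed
        self.buffer = []
        self.model = None
        self.last_alert_ts = float("-inf")

    def prefit(self, history):
        """Fit on historical states so live scoring can start immediately."""
        self.model = IsolationForest(random_state=self.seed).fit(history)

    def observe(self, ts, state):
        """Return True if `state` is an outlier and not inside the cooldown."""
        x = np.asarray(state, dtype=float).reshape(1, -1)
        if self.model is None:
            # Still in burn-in: collect states and fit once enough arrive.
            self.buffer.append(x[0])
            if len(self.buffer) >= self.burn_in:
                self.prefit(np.vstack(self.buffer))
            return False
        is_outlier = self.model.decision_function(x)[0] < 0.0
        if is_outlier and ts - self.last_alert_ts >= self.cooldown_s:
            self.last_alert_ts = ts
            return True
        return False


# Usage: burn in on calm synthetic states, then replay a flash event.
rng = np.random.default_rng(0)
detector = BurnInIsolationForest(burn_in=500, cooldown_s=60.0)
for t in range(500):
    detector.observe(float(t), rng.normal(size=5))
flash = np.full(5, 25.0)
first = detector.observe(500.0, flash)   # far outside the burn-in range
second = detector.observe(510.0, flash)  # same event, inside the cooldown
third = detector.observe(600.0, flash)   # cooldown has elapsed
```

The cooldown is what keeps a single dislocation from producing one finding per event while the market is still dislocated.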
Shipped in v0.4.1 ([flow-ml] extras, torch):
- AutoencoderDetector. Dixon, Halperin, Bilokon (2020). Symmetric MLP encoder-decoder on a 9-dim market-state vector; reconstruction MSE as the anomaly score. Catches nonlinear structural anomalies IsolationForest’s univariate-leaning partitions miss.
- Same lifecycle as IsolationForest: burn-in → fit → score → cooldown → periodic refit. Optional prefit() skips burn-in when historical training data is available.
- 6 tests including end-to-end FlowEngine integration.
- Run both ML detectors in parallel; each emits findings under its own detector_name so downstream filters can treat them differently.
Coming later:
- Transformer-based order-flow models when the research tier is worth the compute. Research-grade; expect longer iteration.
Precision and recall are compared against the classical detectors on recorded data, so the user has an honest trade-off report before flipping the switch.
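The comparison itself is simple to reproduce on your own recordings. A minimal sketch, assuming findings and ground-truth labels can both be reduced to sets of event IDs (the function name and the toy IDs are hypothetical):

```python
def precision_recall(flagged, labeled):
    """Return (precision, recall) for a set of flagged event IDs.

    flagged: IDs a detector reported; labeled: IDs known to be anomalous.
    Empty denominators yield 0.0 rather than raising.
    """
    true_positives = len(flagged & labeled)
    precision = true_positives / len(flagged) if flagged else 0.0
    recall = true_positives / len(labeled) if labeled else 0.0
    return precision, recall


# Usage: compare a rule-based layer and an ML layer on the same recording.
labeled = {2, 3, 4}
classical = precision_recall({1, 2, 3}, labeled)
ml = precision_recall({2, 3, 4, 5}, labeled)
```

In this toy case the ML layer gains recall at similar precision; on real data the trade-off can go either way, which is why the report is produced per detector.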
v1.0: Cross-venue + hardening
Planned. Shipping criteria:
- Cross-venue wallet attribution: link a Polymarket wallet to a Hyperliquid address via same-address heuristics and optional third-party enrichment (Nansen, Arkham). Opt-in with a terms-of-service review step.
- Regulatory-report templates: Reg BI, MAR, MiFID II surveillance report renderers that consume the flow store.
- Performance hardening: less than 100 μs observer overhead at 1 kHz event rate; flow store supports 10 M events per day without degradation.
- The full test target of ≥ 200 flow tests.
Out of scope
These requests come up, so they are stated clearly here to calibrate expectations:
- Attributing wallets to real people. Findings stay pseudonymous. Linking a wallet to a legal entity requires subpoenaed exchange KYC data or compliance-operated KYT tooling, neither of which belongs in an open SDK.
- Front-running or co-trading the detected bots. Ethically and legally distinct from surveillance. The module does not ship “copy the bot” helpers.
- Real-time HFT defense at microsecond latency. Detectors are designed for second-to-minute horizons. Use dedicated FPGA / colo infrastructure for microsecond-grade protection.
- Replacing horizon.compliance. Own-firm policing stays where it is. Flow is market intelligence, not compliance.
Version compatibility
Changes are strictly additive. A v0.2 flow store reads v0.1 records; v0.1 code continues to work on a v0.2 install. Deprecations, if any ever happen, are announced at least one minor version in advance with a working deprecation shim.
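For readers unfamiliar with the term, a deprecation shim is a thin wrapper that keeps the old name callable while warning about its removal. The sketch below shows the general pattern with the standard library; `new_lookup`, `old_lookup`, and the version string are hypothetical and do not correspond to any planned deprecation.

```python
import functools
import warnings


def deprecated_alias(new_func, old_name, removed_in):
    """Return a wrapper that forwards to `new_func` and emits a warning."""

    @functools.wraps(new_func)
    def shim(*args, **kwargs):
        warnings.warn(
            f"{old_name} is deprecated and will be removed in {removed_in}; "
            f"use {new_func.__name__} instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not the shim
        )
        return new_func(*args, **kwargs)

    return shim


def new_lookup(actor_id):
    """Hypothetical replacement function."""
    return {"actor_id": actor_id}


# The old name keeps working, but callers see a DeprecationWarning.
old_lookup = deprecated_alias(new_lookup, "old_lookup", "the next minor release")
```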