Quickstart
Zero to your first detected spoof in 15 minutes. No live data needed. We synthesize a market.
This tutorial takes you from a fresh install to a working flow engine that detects a spoof in a synthesized market. No Polygon RPC, no API keys, no live venue. Everything runs locally against a generated feed so you can see exactly what goes in and exactly what comes out.
After this you’ll know:
- How to wire an engine with all default detectors.
- What a MarketEvent looks like on the wire.
- How to feed events and read findings.
- How to emit findings to both a store and an audit log.
- Where to look next based on what you want to do.
1. Install
# Core module only
pip install horizon
# With flow extras (clustering, web3, websockets)
pip install 'horizon[flow]'
For this tutorial, the core package is enough. We won’t touch live venues or clustering.
2. Build an engine
# tutorial.py
from horizon.flow import SQLiteFlowStore, make_default_engine, set_default_store
from horizon.audit import AuditLog, InMemorySink
# 2a. Stores: flow findings land here
store_path = "/tmp/tutorial_flow.db"
flow_store = SQLiteFlowStore(store_path)
set_default_store(flow_store)
# 2b. Audit log (in-memory for the tutorial; SQLiteSink for real use)
audit_sink = InMemorySink()
audit_log = AuditLog(sink=audit_sink)
# 2c. Engine wired with all v0.1 detectors + actor profiling + taxonomy
engine = make_default_engine(
    venue_name="polymarket",
    store_path=store_path,
    audit_log=audit_log,
)
print(f"engine ready: {len(engine._detectors)} detectors attached")
Run it:
$ python tutorial.py
engine ready: 7 detectors attached
Seven detectors: spoofing, layering, quote-stuffing, wash-trade, momentum-ignition, iceberg, split-order.
3. Generate a synthetic market with a spoofing pattern
We’ll use the same SyntheticFlowGenerator the test suite uses. It produces a reproducible event stream with injectable manipulation patterns.
from datetime import datetime, timezone
from tests.flow.conftest import SyntheticFlowGenerator
# A 10-minute window of activity on one Polymarket market
gen = SyntheticFlowGenerator(
    seed=42,
    venue="polymarket",
    market_id="0xTRUMP_2024",
    start=datetime(2026, 4, 20, 12, 0, 0, tzinfo=timezone.utc),
)
# 5 background traders doing normal activity
gen.add_background_actors(n=5, duration_s=600, trades_per_actor=40)
# Inject a spoofing pattern at t+60s
gen.inject_spoofing(
    actor_id="0xSpoofer",
    t_offset_s=60.0,
    bait_size=5000.0,       # large bait
    aggressor_size=100.0,   # small aggressor (bait is 50x larger)
)
events = gen.generate()
print(f"generated {len(events)} events")
Output:
generated 416 events
If tests/ isn’t on your import path, copy the SyntheticFlowGenerator class from tests/flow/conftest.py into your own module. It’s pure stdlib.
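To make the injected pattern concrete, here is a minimal sketch of the size-ratio arithmetic behind it. It is not the library's spoofing detector: `SpoofCandidate`, `looks_like_spoof`, both thresholds, and the 400 ms cancel delay are hypothetical values chosen for illustration. Only `bait_size=5000.0` and `aggressor_size=100.0` come from the tutorial.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpoofCandidate:
    """Hypothetical summary of a bait order and the aggressor trade it shadows."""
    bait_size: float
    aggressor_size: float
    cancel_delay_ms: float  # assumed: time from aggressor fill to bait cancel


def size_ratio(c: SpoofCandidate) -> float:
    """How many times larger the bait is than the aggressor."""
    return c.bait_size / c.aggressor_size


def looks_like_spoof(
    c: SpoofCandidate,
    min_ratio: float = 10.0,              # hypothetical threshold
    max_cancel_delay_ms: float = 1000.0,  # hypothetical threshold
) -> bool:
    """Flag a large bait that is canceled quickly after a small aggressor."""
    return size_ratio(c) >= min_ratio and c.cancel_delay_ms <= max_cancel_delay_ms


injected = SpoofCandidate(bait_size=5000.0, aggressor_size=100.0, cancel_delay_ms=400.0)
print(size_ratio(injected))        # 50.0
print(looks_like_spoof(injected))  # True
```

A large ratio combined with a fast cancel is the shape the injected pattern is built to have; a real detector would weigh more context than these two numbers.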
4. Feed the engine
for ev in events:
    engine.ingest(ev)
print(f"ingested {engine.ingest_count} events")
Output:
ingested 416 events
5. Query findings
import horizon as hz
findings = hz.flow.anomalies(market_id="0xTRUMP_2024")
print(f"{len(findings)} findings")
for f in findings[:5]:
    print(f" {f.category.value:20s} conf={f.confidence:.2f} sev={f.severity.value:8s} actor={f.actor_id}")
    print(f" {f.message}")
    print(f" cite: {f.citation}")
Output (truncated, ordered by recency):
3 findings
spoofing conf=0.88 sev=high actor=0xSpoofer
bait buy size=5000.0 canceled 400ms after opposite-side aggressor; ratio=50.0
cite: Lee, Eom, Park 2013. Microstructure-based Manipulation
split_order conf=0.50 sev=medium actor=0xBG_003
5 buy children over 14.2s total=218 @ ~0.500123 (±2.3bps)
cite: Hautsch, Huang 2012. Order splitting / linkage
...
The spoofing finding corresponds exactly to the pattern we injected. The split-order finding is a false positive produced by the background flow generator. The detector itself is behaving as designed: it emits this signal whenever three same-side trades cluster in time. Real background flow can trigger these too; tune the detector thresholds if you see too many.
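To see why threshold tuning suppresses findings like the split-order one, consider a minimal sketch of a same-side clustering check. This is an illustration under stated assumptions, not the library's split-order detector: the function name, the `min_children` and `window_s` parameters, and the sample timestamps are all hypothetical.

```python
from collections import deque
from typing import Deque, Iterable, Tuple

Trade = Tuple[float, str]  # (timestamp in seconds, side)


def has_same_side_cluster(
    trades: Iterable[Trade],
    side: str,
    min_children: int = 3,   # hypothetical: trades needed to call it a cluster
    window_s: float = 30.0,  # hypothetical: time window in seconds
) -> bool:
    """Return True if any window of window_s holds min_children trades on one side."""
    recent: Deque[float] = deque()
    for ts, trade_side in sorted(trades):
        if trade_side != side:
            continue
        recent.append(ts)
        # Drop timestamps that have fallen out of the window.
        while ts - recent[0] > window_s:
            recent.popleft()
        if len(recent) >= min_children:
            return True
    return False


# Five buys spread over 14.2 seconds, plus one unrelated sell.
background = [(0.0, "buy"), (3.1, "buy"), (5.0, "sell"),
              (7.4, "buy"), (10.9, "buy"), (14.2, "buy")]

print(has_same_side_cluster(background, "buy"))                  # True
print(has_same_side_cluster(background, "buy", min_children=6))  # False
print(has_same_side_cluster(background, "buy", window_s=5.0))    # False
```

Raising the child count or narrowing the window both silence this cluster. The trade-off is the usual one: stricter thresholds reduce false positives and also hide genuinely split orders that are small or slow.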
6. Inspect the audit log
Findings don’t just live in the flow store. They also emit into the hash-chained audit log.
from horizon.audit.events import AuditCategory
flow_events = [
    e for e in audit_sink.read_range()
    if e.category in (
        AuditCategory.FlowAnomaly,
        AuditCategory.BotDetected,
        AuditCategory.ActorProfiled,
    )
]
for e in flow_events[:5]:
    print(f" seq={e.sequence:4d} {e.category.value:25s} {e.message[:70]}")
Output:
seq= 1 flow.actor_profiled actor 0xBG_000 profile refreshed
seq= 2 flow.actor_profiled actor 0xBG_001 profile refreshed
...
seq= 42 flow.anomaly bait buy size=5000.0 canceled 400ms after...
Every finding in the flow store has a corresponding audit event. The audit log is your regulator-grade record; the flow store is your operational query layer.
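If hash chaining is unfamiliar, the general idea can be sketched with the standard library alone. This is a generic illustration of the technique, not the format AuditLog actually uses: the field names, the SHA-256 choice, the JSON encoding, and the `append_event` and `verify` helpers are assumptions.

```python
import hashlib
import json
from typing import Dict, List

GENESIS = "0" * 64  # assumed placeholder for "no previous entry"


def _digest(body: Dict) -> str:
    """Hash a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_event(chain: List[Dict], category: str, message: str) -> Dict:
    """Append an entry whose hash covers its content and the previous hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    body = {
        "sequence": len(chain) + 1,
        "category": category,
        "message": message,
        "prev_hash": prev_hash,
    }
    entry = dict(body, hash=_digest(body))
    chain.append(entry)
    return entry


def verify(chain: List[Dict]) -> bool:
    """Recompute every hash and check each link to its predecessor."""
    prev_hash = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True


chain: List[Dict] = []
append_event(chain, "flow.actor_profiled", "actor 0xBG_000 profile refreshed")
append_event(chain, "flow.anomaly", "bait buy size=5000.0 canceled")
print(verify(chain))  # True

chain[0]["message"] = "tampered"
print(verify(chain))  # False
```

Because each entry's hash covers the previous entry's hash, editing any record invalidates that record and breaks the link to every record after it, which is what makes such a log useful as evidence.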
7. Profile the spoofer
profile = hz.flow.actor_profile("0xSpoofer", venue="polymarket")
if profile:
    print(f"events observed: {profile.features.event_count}")
    print("top categories:")
    top = sorted(profile.taxonomy_probs.items(), key=lambda kv: -kv[1])
    for cat, prob in top[:3]:
        print(f" {cat:25s} {prob:.2%}")
With only 3 events (bait place + aggressor fill + bait cancel), the spoofer is under the minimum-events threshold for full classification. In a realistic stream the spoofer would be surfaced as HFT or Opportunistic by the Kirilenko classifier.
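The gating described here can be sketched as follows. This is a hypothetical illustration, not the library's classifier: the `MIN_EVENTS` value, the `top_categories` helper, and the category names and probabilities are made up for the example.

```python
from typing import Dict, List, Optional, Tuple

MIN_EVENTS = 20  # hypothetical threshold; the real value is not stated here


def top_categories(
    event_count: int,
    taxonomy_probs: Dict[str, float],
    k: int = 3,
    min_events: int = MIN_EVENTS,
) -> Optional[List[Tuple[str, float]]]:
    """Return the k most probable categories, or None if too few events were seen."""
    if event_count < min_events:
        return None  # not enough evidence to classify
    ranked = sorted(taxonomy_probs.items(), key=lambda kv: -kv[1])
    return ranked[:k]


# Three events, as with the injected spoofer: classification is deferred.
print(top_categories(3, {"hft": 0.6, "opportunistic": 0.4}))
# None

# A longer history clears the threshold and yields a ranking.
print(top_categories(250, {"hft": 0.6, "opportunistic": 0.3, "market_maker": 0.1}, k=2))
# [('hft', 0.6), ('opportunistic', 0.3)]
```

Returning nothing below the threshold is a deliberate choice in this sketch: a label built on three events would look authoritative while carrying almost no information.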
8. What just happened
You wrote zero bytes of detection logic. The engine:
- Normalized 416 synthetic events into its rolling state.
- Ran seven detectors on every event against that state.
- Emitted findings to two stores (flow store + audit log).
- Updated an ActorProfile for each actor that crossed the refresh interval.
And it did this deterministically: run the script again and you get the same findings. The determinism that comes from a seeded SyntheticFlowGenerator plus a seeded FlowConfig is a compliance property, not a happy accident.
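One way to check this kind of determinism is to fingerprint the output of two runs and compare. The sketch below uses a stand-in generator built on the standard library rather than SyntheticFlowGenerator itself; `synth_stream`, `fingerprint`, and the event shape are hypothetical.

```python
import hashlib
import random
from typing import List, Tuple

Event = Tuple[float, str, float]  # (timestamp, side, size)


def synth_stream(seed: int, n: int = 100) -> List[Event]:
    """Stand-in generator: every random draw comes from one seeded RNG."""
    rng = random.Random(seed)  # private RNG, so global random state is irrelevant
    events: List[Event] = []
    t = 0.0
    for _ in range(n):
        t += rng.expovariate(1.0)  # random gap between events
        side = rng.choice(("buy", "sell"))
        size = round(rng.uniform(1.0, 100.0), 2)
        events.append((round(t, 6), side, size))
    return events


def fingerprint(events: List[Event]) -> str:
    """Hash the whole stream so two runs can be compared with one string."""
    return hashlib.sha256(repr(events).encode("utf-8")).hexdigest()


same = fingerprint(synth_stream(42)) == fingerprint(synth_stream(42))
different = fingerprint(synth_stream(42)) != fingerprint(synth_stream(43))
print(same, different)  # True True
```

The same comparison applies to findings: serialize them in a stable order, hash the result, and store the hash alongside the seed. A mismatch on a later run points to a source of nondeterminism such as wall-clock time or unordered iteration.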
Next steps
Pick based on what you’re trying to do:
- Protect a strategy from toxic flow → Defend recipes. Gate orders on active findings, avoid counterparties with suspicious profiles.
- Use flow signals as alpha → Alpha recipes. Consume actor profiles in a strategy, reverse-engineer a bot’s policy to anticipate it.
- Investigate a specific wallet / market → Investigation recipes. Export findings, profile a whale, trace a policy.
- Run this in production → Operations recipes. Cron, Slack alerts, live feeds, debugging.
- Understand what the engine is doing internally → How it works. Mental model, event flow, extension points.
Full tutorial script
Copy-paste the full listing for reference:
# tutorial.py: complete end-to-end from zero to findings
from datetime import datetime, timezone
import horizon as hz
from horizon.audit import AuditLog, InMemorySink
from horizon.audit.events import AuditCategory
from horizon.flow import SQLiteFlowStore, make_default_engine, set_default_store
from tests.flow.conftest import SyntheticFlowGenerator
# 1. Stores
flow_store = SQLiteFlowStore("/tmp/tutorial_flow.db")
set_default_store(flow_store)
audit_sink = InMemorySink()
audit_log = AuditLog(sink=audit_sink)
# 2. Engine
engine = make_default_engine(
    venue_name="polymarket",
    store_path="/tmp/tutorial_flow.db",
    audit_log=audit_log,
)
# 3. Generate synthetic market
gen = SyntheticFlowGenerator(
    seed=42, venue="polymarket", market_id="0xTRUMP_2024",
    start=datetime(2026, 4, 20, 12, 0, 0, tzinfo=timezone.utc),
)
gen.add_background_actors(n=5, duration_s=600, trades_per_actor=40)
gen.inject_spoofing(actor_id="0xSpoofer", t_offset_s=60.0,
                    bait_size=5000.0, aggressor_size=100.0)
# 4. Ingest
for ev in gen.generate():
    engine.ingest(ev)
# 5. Findings
for f in hz.flow.anomalies(market_id="0xTRUMP_2024"):
    print(f"{f.category.value} conf={f.confidence:.2f} actor={f.actor_id}")
    print(f" {f.message}")
# 6. Audit events
flow_audit = [
    e for e in audit_sink.read_range()
    if e.category in (AuditCategory.FlowAnomaly, AuditCategory.BotDetected)
]
print(f"\n{len(flow_audit)} flow events in audit log")