Equities & options tape (v0.2)
Running flow against anonymous tape. Alpaca, IBKR, CCXT. What works, what doesn't, and how to interpret market-level pseudo-profiles.
v0.2 adds tape-level ingestion for venues that don’t expose counterparty identities. The SDK treats those venues as “anonymous flow” and degrades gracefully: everything that works at the market level still works; actor-scoped detectors and per-actor taxonomy classification run on synthetic per-(market, time-window) pseudo-actors.
This is the recipe for running flow against Alpaca, IBKR, or CCXT: anywhere what you see is the consolidated tape, not a wallet.
What you get and don’t get on anonymous tape
Side-by-side with the v0.1 Polymarket / Hyperliquid path:
| Capability | Polymarket / HL (v0.1) | Alpaca / IBKR / CCXT (v0.2) |
|---|---|---|
| Per-wallet profiling | Yes | No (bucketed per-market-per-window) |
| Kirilenko taxonomy on a specific actor | Yes | No (applies to the bucket’s aggregate behavior) |
| Wallet clustering | Yes | N/A |
| Spoofing / layering / momentum-ignition / split-order | Yes (actor-scoped) | Skipped (no actor to attribute to) |
| Quote-stuffing (market-level) | Yes | Yes |
| Wash-trade (round-number + Benford) | Yes | Yes |
| Iceberg reload (book-level) | Yes | Yes |
| VPIN / OFI / Hawkes | Yes | Yes |
| Shadow-policy per actor | Yes | No (nothing to fit against) |
| FlowAnomalyCheck market-gate | Yes | Yes (fires on market-level findings) |
In short: market-level signals are first-class; actor-level ones degrade to skips. The defense + alpha recipes in Defend / Alpha that rely only on market-level findings (recipes 1, 3, 4 on both pages) work unchanged.
The synthetic pseudo-actor model
When a MarketEvent arrives with actor_id=None, the ActorFeatureExtractor derives a synthetic key:
anon_{market_id}_{window_index}
where window_index = floor(event_timestamp / FlowConfig.actors.anonymize_window_s) (default 300 seconds = 5-minute windows).
Every anonymous event on the same market in the same 5-minute window accumulates into the same pseudo-actor. After enough events, that pseudo-actor gets:
- A full ActorFeatures vector summarizing market activity in the window.
- A Kirilenko taxonomy probability vector: “this market/window was 60% HFT-dominated, 25% opportunistic, 10% fundamental, 5% other.”
- An entry in the flow store like any other profile.
Read these as characterizations of the market itself during that window, not of a specific counterparty. The language is different but the downstream machinery (store, audit, CLI) is identical.
Turn it off by setting anonymize_window_s = 0.0 in FlowConfig.actors. None-actor events are then silently skipped (v0.1 behavior).
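As a rough illustration of the bucketing rule, the standalone sketch below groups anonymous events by their derived key and shows that a window of 0.0 yields no key at all. It is not the SDK's ActorFeatureExtractor: the helper name `pseudo_actor_key` and the `(market_id, timestamp)` event tuples are hypothetical stand-ins.

```python
import math
from collections import Counter
from typing import Optional


def pseudo_actor_key(market_id: str, ts: float, window_s: float = 300.0) -> Optional[str]:
    """Hypothetical helper mirroring the anon_{market_id}_{window_index} rule."""
    if window_s <= 0:
        # Mirrors anonymize_window_s = 0.0: anonymous events are skipped.
        return None
    return f"anon_{market_id}_{math.floor(ts / window_s)}"


# (market_id, epoch seconds) pairs standing in for actor_id=None events.
events = [
    ("AAPL", 1777645800.0),
    ("AAPL", 1777645920.5),
    ("AAPL", 1777646100.0),
    ("MSFT", 1777645930.0),
]

# Count how many events land in each pseudo-actor bucket.
buckets = Counter(pseudo_actor_key(m, ts) for m, ts in events)
for key, n in sorted(buckets.items()):
    print(key, n)
```

The first two AAPL events share a window and so share a pseudo-actor; the third falls into the next window and starts a new one.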
Recipe: live Alpaca surveillance
from horizon.audit import AuditLog, SQLiteSink
from horizon.data.live.alpaca_ws import AlpacaLiveFeed
from horizon.data.live.base import SubscriptionKind
from horizon.flow import SQLiteFlowStore, make_default_engine
from horizon.flow.ingestion.alpaca import AlpacaFlowSource
# 1. Standard stores
flow_store = SQLiteFlowStore("/var/horizon/alpaca_flow.db")
audit_log = AuditLog(sink=SQLiteSink("/var/horizon/alpaca_audit.db"))
# 2. Engine: note venue_name="alpaca" so findings group correctly
engine = make_default_engine(
    venue_name="alpaca",
    store_path="/var/horizon/alpaca_flow.db",
    audit_log=audit_log,
)
# 3. The Alpaca feed: trades + quotes on your watchlist
feed = AlpacaLiveFeed(api_key="...", api_secret="...")
feed.subscribe(["AAPL", "MSFT", "NVDA", "TSLA"], SubscriptionKind.Trades)
feed.subscribe(["AAPL", "MSFT", "NVDA", "TSLA"], SubscriptionKind.Quotes)
# 4. Ingestion source: wraps the feed, emits actor_id=None events
source = AlpacaFlowSource(feed=feed, connect_feed=True)
source.on_event(engine.ingest)
source.connect()
# Done: findings flow into the store and audit log as trades + quotes arrive.
Note connect_feed=True: when the flow source is the only consumer of the feed, let it manage the feed’s lifecycle. If you already have a trading pipeline that owns the feed, use connect_feed=False (the default); the source then just adds itself as an extra on_tick handler.
Recipe: pseudo-actor reports
The CLI works against anon keys the same as named ones:
horizon flow anomalies --db=/var/horizon/alpaca_flow.db \
    --market=AAPL --since-hours=24 --limit=100 | jq
# Dump the 5-min pseudo-profile for a specific window
horizon flow profile --db=/var/horizon/alpaca_flow.db \
    --venue=alpaca \
    --actor=anon_AAPL_5925487 | jq
To map a timestamp to the right window index:
import datetime
def anon_key(market_id: str, at: datetime.datetime, window_s: float = 300.0) -> str:
    return f"anon_{market_id}_{int(at.timestamp() // window_s)}"
# e.g., pseudo-actor covering AAPL at 14:32 UTC on 2026-05-01 with 5-min windows
key = anon_key("AAPL", datetime.datetime(2026, 5, 1, 14, 32, tzinfo=datetime.timezone.utc))
print(key) # anon_AAPL_5925486
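Going the other way is handy when reading CLI output: given an anon key, which stretch of time does it describe? The sketch below inverts the key format, assuming the default 300-second window and that the window index is always the final underscore-separated field. `anon_window` is a hypothetical helper, not part of the SDK.

```python
from datetime import datetime, timedelta, timezone


def anon_window(key: str, window_s: float = 300.0) -> tuple[str, datetime, datetime]:
    """Split an anon key into (market_id, window_start, window_end), all UTC."""
    prefix, _, rest = key.partition("_")
    if prefix != "anon" or not rest:
        raise ValueError(f"not an anon key: {key!r}")
    # Split from the right so market ids that contain underscores survive.
    market_id, _, index = rest.rpartition("_")
    if not market_id or not index.isdigit():
        raise ValueError(f"malformed anon key: {key!r}")
    start = datetime.fromtimestamp(int(index) * window_s, tz=timezone.utc)
    return market_id, start, start + timedelta(seconds=window_s)


market, start, end = anon_window("anon_AAPL_5925486")
print(market, start.isoformat(), end.isoformat())
```

For anon_AAPL_5925486 this gives the window from 14:30 to 14:35 UTC on 2026-05-01, which contains the 14:32 timestamp used above. The window length must match the anonymize_window_s the store was written with, otherwise the bounds are wrong.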
Recipe: what DOES fire on anonymous tape
The detectors that fire without actor attribution:
- Iceberg. Watches visible depth reload across the book. Works on Alpaca’s L2 snapshots as well as Polymarket’s CLOB. See detectors/iceberg.
- Wash trade. Flags round-number clustering plus a Benford-anomalous size distribution on the market’s tape. Equity tape is cleaner than crypto tape by default, so thresholds may need adjustment.
- Quote stuffing. Counts placements + cancels per market per time window; fires on message-rate spikes with low fill ratio.
- Market-level toxicity. VPIN / OFI / Hawkes on the tape + quote stream.
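To make the quote-stuffing condition concrete, here is a minimal sketch of a "message-rate spike with low fill ratio" test over one market window. It is not the SDK's detector: the `WindowCounts` type, the function name, and both thresholds (`spike_mult`, `max_fill_ratio`) are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class WindowCounts:
    """Message counts for one market in one time window (hypothetical shape)."""
    placements: int
    cancels: int
    fills: int


def looks_like_quote_stuffing(
    current: WindowCounts,
    baseline_msgs: float,
    *,
    spike_mult: float = 5.0,       # assumed: 5x the usual message count is a spike
    max_fill_ratio: float = 0.02,  # assumed: under 2% of placements ever fill
) -> bool:
    messages = current.placements + current.cancels
    if messages == 0 or baseline_msgs <= 0:
        return False
    fill_ratio = current.fills / max(current.placements, 1)
    # Both conditions must hold: heavy traffic AND almost nothing trading.
    return messages >= spike_mult * baseline_msgs and fill_ratio <= max_fill_ratio


burst = WindowCounts(placements=3000, cancels=2950, fills=20)
busy = WindowCounts(placements=3000, cancels=1500, fills=1200)
print(looks_like_quote_stuffing(burst, baseline_msgs=400))
print(looks_like_quote_stuffing(busy, baseline_msgs=400))
```

Requiring both conditions is what keeps a genuinely busy market (high message rate, but plenty of fills) from being flagged. No actor id appears anywhere in the inputs, which is why this style of check survives on anonymous tape.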
And the ones that silently skip:
- Spoofing / Layering. Require same-actor placement+cancel pattern. No actor, no fire.
- Momentum ignition. Requires same-actor aggressor+reversal. Skips.
- Split order. Requires same-actor child-chain linking. Skips.
Recipe: market-regime filter from a pseudo-actor
The most useful v0.2 output is the Kirilenko distribution per market-window. Use it as a regime filter in a strategy:
import horizon as hz
from datetime import datetime, timezone
def market_regime(market_id: str, *, venue="alpaca",
                  window_s: float = 300.0) -> dict[str, float]:
    """Return the taxonomy probs for the current 5-min window, or {} if
    the pseudo-actor hasn't accumulated enough events yet."""
    now = datetime.now(timezone.utc)
    window_idx = int(now.timestamp() // window_s)
    key = f"anon_{market_id}_{window_idx}"
    profile = hz.flow.actor_profile(key, venue=venue)
    return profile.taxonomy_probs if profile else {}

class RegimeAwareStrategy(hz.Strategy):
    def evaluate(self, f, universe):
        signals = []
        for market in universe:
            probs = market_regime(market.id)
            hft_pct = probs.get("hft", 0.0)
            if hft_pct > 0.5:
                # Majority-HFT regime → widen spreads, reduce size
                size_mult = 0.5
            else:
                size_mult = 1.0
            if f.z[market.id] < -2.0:
                signals.append(hz.Signal.increase(
                    market, edge_bps=int(40 * size_mult), horizon="1h",
                ))
        return signals
The pseudo-actor smooths over 5 minutes of activity, so the regime reading is stable enough to act on (not flipping every tick) while still reflecting current conditions (not yesterday’s).
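One edge case is worth handling: in the first moments of a new window the current pseudo-actor may not have a profile yet, so market_regime returns {} and the strategy treats the market as non-HFT. A possible mitigation, sketched below, is to fall back to the most recent earlier window. The `lookup` callable is a stand-in for whatever fetches taxonomy probabilities by key (for example a thin wrapper around hz.flow.actor_profile); `latest_regime`, `max_lookback`, and every taxonomy label other than "hft" are hypothetical.

```python
from typing import Callable, Optional

ProbLookup = Callable[[str], Optional[dict[str, float]]]


def latest_regime(
    lookup: ProbLookup,
    market_id: str,
    now_ts: float,
    *,
    window_s: float = 300.0,
    max_lookback: int = 1,
) -> dict[str, float]:
    """Return taxonomy probs from the newest window that has a profile.

    Checks the current window first, then up to max_lookback earlier ones.
    """
    current_idx = int(now_ts // window_s)
    for idx in range(current_idx, current_idx - max_lookback - 1, -1):
        probs = lookup(f"anon_{market_id}_{idx}")
        if probs:
            return probs
    return {}


# In-memory stand-in for the flow store, keyed by pseudo-actor.
fake_store = {
    "anon_AAPL_5925485": {
        "hft": 0.6, "opportunistic": 0.25, "fundamental": 0.10, "other": 0.05,
    },
}
probs = latest_regime(fake_store.get, "AAPL", now_ts=1777645920.0)
print(probs.get("hft", 0.0))
```

Keeping max_lookback small preserves the point of the regime filter: a reading from one window ago is still current, while a reading from an hour ago is not.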
Recipe: compare toxicity between venues
Anonymous tape lets you compare markets ACROSS venues. Since the pseudo-actor model is symmetric, a VPIN / OFI / Hawkes reading on Alpaca AAPL is directly comparable to the same reading on IBKR AAPL or Hyperliquid BTC-PERP.
import horizon as hz
from datetime import datetime, timedelta, timezone
markets = [
    ("alpaca", "AAPL"),
    ("alpaca", "TSLA"),
    ("hyperliquid", "BTC"),
]
since = datetime.now(timezone.utc) - timedelta(hours=1)
for venue, market_id in markets:
    anomalies = hz.flow.anomalies(
        market_id=market_id, since=since, limit=100,
    )
    wash = sum(1 for a in anomalies if a.category.value == "wash_trade")
    stuff = sum(1 for a in anomalies if a.category.value == "quote_stuffing")
    print(f"{venue:12s} {market_id:6s} wash={wash:3d} stuff={stuff:3d}")
If you’re allocating across venues, lower toxicity = better execution for the same alpha.
Limitations
Some things v0.2 does NOT give you:
- Broker-specific fill attribution. Alpaca reports “this trade happened on this exchange” in the tick, but not “my fills came from wallet X.” The venue layer knows that for your own orders; the flow layer can’t attribute other participants’ fills.
- Dark-pool visibility. Tape doesn’t include dark-pool prints until they report (often delayed). Flow sees only what the SIP shows.
- Options chain context. This release ships the ingestion for equity underlyings only. An options-aware extension (Greek context, IV regime) is v0.3+.
- IBKR / CCXT sources. The pattern is the same as AlpacaFlowSource; those adapters follow in upcoming iterations.
Migrating from v0.1
No breaking changes. An existing v0.1 deployment picks up v0.2 behavior automatically. None-actor events that were previously skipped are now bucketed. If you prefer v0.1 semantics (skip anonymous events), set:
from dataclasses import replace
import horizon as hz
cfg = hz.flow.FlowConfig()
cfg.actors = replace(cfg.actors, anonymize_window_s=0.0)
Otherwise, expect your flow store to start accumulating anon_* profiles at a rate proportional to the volume of anonymous tape you’re consuming. Storage is bounded by (n_markets × duration / window_s): 100 markets × 288 five-minute windows per day ≈ 29k profiles/day, or roughly 10.5M profiles over a year. That is manageable in a single SQLite file.
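The bound is easy to recompute for your own market count and window length. The helper below is a back-of-the-envelope sketch (`anon_profiles` is a hypothetical name); it assumes every market produces at least one anonymous event in every window, so real counts will be at or below this.

```python
SECONDS_PER_DAY = 86_400


def anon_profiles(n_markets: int, days: float, window_s: float = 300.0) -> int:
    """Upper bound on anon_* profiles: one per market per window."""
    if window_s <= 0:
        return 0  # bucketing disabled, no anon profiles are written
    return int(n_markets * days * SECONDS_PER_DAY / window_s)


print(anon_profiles(100, days=1))    # 100 markets, one day
print(anon_profiles(100, days=365))  # 100 markets, one year
```

With the default 300-second window, one day has 86,400 / 300 = 288 windows, so 100 markets give 28,800 profiles per day and 10,512,000 per year. Doubling the window length halves both figures.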
Related
- Defend recipes. Recipes 1, 3, 4 work on anonymous tape.
- Alpha recipes. Recipes 1, 3, 5 work on anonymous tape.
- Roadmap. What v0.2 ships vs v0.3 plans.
- Actor profiling. Reference for the profile fields and taxonomy.