Alpha: trade with flow as an input
Use actor profiles, toxicity measures, and reverse-engineered bot policies as alpha signals, not just as defensive filters.
The defensive recipes (Defend) treat flow findings as filters: when the signal is bad, don’t trade. The alpha recipes treat the same data as inputs: the signal tells you something the market hasn’t priced yet.
Five recipes, ordered roughly from least to most speculative. The first two rely on well-established microstructure relationships; the later ones, the shadow-policy recipe in particular, are research-grade and require care before production use.
Recipe 1: fade a wash-trade pump
This recipe has the highest expected value of the five. When the wash-trade detector fires on a market with a sudden price move, the move is at least partly fake: it was inflated by self-trading rather than real demand, so it tends to revert.
import horizon as hz
from datetime import datetime, timedelta, timezone
from horizon.flow.events import AnomalyCategory

class WashFadeStrategy(hz.Strategy):
    """When wash trading is detected and price has moved, fade the move."""

    def evaluate(self, f, universe):
        signals = []
        since = datetime.now(timezone.utc) - timedelta(minutes=5)
        for market in universe:
            recent_wash = hz.flow.anomalies(
                market_id=market.id,
                category=AnomalyCategory.WashTrade,
                since=since,
            )
            if not recent_wash:
                continue
            # Highest-confidence finding drives conviction
            top = max(recent_wash, key=lambda w: w.confidence)
            if top.confidence < 0.75:
                continue
            # Direction: which way did the price move?
            recent_return_5m = f.mid_return_5m[market.id]  # your feature
            if recent_return_5m > 0.02:  # up move → fade short
                signals.append(hz.Signal.decrease(
                    market, edge_bps=int(50 * top.confidence), horizon="1h",
                ))
            elif recent_return_5m < -0.02:  # down move → fade long
                signals.append(hz.Signal.increase(
                    market, edge_bps=int(50 * top.confidence), horizon="1h",
                ))
        return signals
Why it works. Wash trading inflates reported volume and nudges price via trade-driven order flow impact (Cont-Kukanov-Stoikov 2014), but without underlying directional demand, the move decays as real counterparties fade it. Cong-Li-Tang-Yang (2023) estimated that wash-inflated price moves in crypto revert within hours.
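The decision rule itself can be isolated from the framework so it can be unit-tested offline. The following is a minimal sketch, not part of the horizon API: `fade_decision` is a hypothetical helper, and its thresholds simply mirror the constants used in the strategy above.

```python
from typing import Optional, Tuple

def fade_decision(
    confidence: float,
    return_5m: float,
    min_confidence: float = 0.75,   # same gate as the strategy above
    min_move: float = 0.02,         # 2% move over 5 minutes
    max_edge_bps: int = 50,
) -> Optional[Tuple[str, int]]:
    """Return (direction, edge_bps) for a fade, or None when no trade is warranted.

    Hypothetical helper: direction is "decrease" to fade an up move and
    "increase" to fade a down move. Edge scales linearly with confidence.
    """
    if confidence < min_confidence:
        return None
    edge_bps = int(max_edge_bps * confidence)
    if return_5m > min_move:
        return ("decrease", edge_bps)
    if return_5m < -min_move:
        return ("increase", edge_bps)
    return None  # wash detected, but no price move worth fading
```

Keeping the rule in a pure function makes it straightforward to sweep `min_confidence` and `min_move` in a backtest without touching the strategy wiring.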
Risks.
- False positive on the wash detector → you fade a genuine move.
- Wash continues indefinitely → your fade is steamrolled.
Mitigate with recipe 5 from Defend: hard stop-loss at the wash-finding confidence level.
Recipe 2: reduce adverse selection by reading counterparty taxonomy
Adverse selection is the cost of trading against someone who knows more than you. HFTs and informed traders are the primary sources. The Kirilenko taxonomy gives you a probability of each, which you can translate into an expected adverse-selection haircut on your quoted price.
import horizon as hz
from horizon.flow.actors.taxonomy import TraderCategory

class AdverseSelectionAwareMaker(hz.Strategy):
    """Market-making strategy that widens spreads when the top-of-book
    counterparty is likely HFT or informed."""

    BASE_SPREAD_BPS = 10

    def evaluate(self, f, universe):
        signals = []
        for market in universe:
            # Get top 2 wallets on each side of book
            bid_wallets = self._top_wallets(market.id, "bid", k=2)
            ask_wallets = self._top_wallets(market.id, "ask", k=2)
            adverse_hft_prob = self._max_hft_prob(bid_wallets + ask_wallets)
            # Widen by up to 2x when adverse HFT probability is high
            widening = 1.0 + adverse_hft_prob
            effective_spread = self.BASE_SPREAD_BPS * widening
            # Quote at that spread
            # (details of how you emit maker quotes depend on your pipeline)
            signals.extend(self._maker_quotes(market, effective_spread))
        return signals

    def _max_hft_prob(self, wallets: list[str]) -> float:
        # Worst case across the wallets: the highest HFT probability wins
        best = 0.0
        for w in wallets:
            profile = hz.flow.actor_profile(w, venue="polymarket")
            if profile is None:
                continue
            hft = profile.taxonomy_probs.get(TraderCategory.HFT.value, 0.0)
            best = max(best, hft)
        return best

    def _top_wallets(self, market_id, side, k): ...  # your order-book wiring
    def _maker_quotes(self, market, spread_bps): ...  # your quote emission
Why it works. HFT market makers earn the spread by being fast enough to adverse-select slower counterparties. If you quote at the same spread they do, they pick you off on the half of the order flow where they have the better signal. The Kirilenko-weighted widening is a first-order compensation.
Calibration. Backtest widening factors in [1.0, 1.5, 2.0] against your historical P&L. The widening = 1 + hft_prob heuristic over-widens when the HFT probability is inflated by small samples. Cap it with min(adverse_hft_prob, 0.6) until each actor has enough events to meet the Kirilenko classifier’s event_count >= 100 threshold.
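The capped widening can be written as a small pure function. This is a sketch under stated assumptions: `effective_spread_bps` is a hypothetical helper, and `event_count` is assumed to be available per actor (the source names the threshold but not where the count is read from).

```python
def effective_spread_bps(
    base_spread_bps: float,
    hft_prob: float,
    event_count: int,
    prob_cap: float = 0.6,    # cap from the calibration note above
    min_events: int = 100,    # classifier threshold from the note above
) -> float:
    """Widen the base spread by (1 + hft_prob), capping the probability
    while the actor's sample is too small to trust."""
    p = min(max(hft_prob, 0.0), 1.0)      # clamp to a valid probability
    if event_count < min_events:
        p = min(p, prob_cap)              # small sample: do not over-widen
    return base_spread_bps * (1.0 + p)
```

With a 10 bps base spread, an actor at hft_prob 0.9 is quoted at 16 bps while it has fewer than 100 events and at 19 bps once the sample is large enough.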
Recipe 3: Hawkes branching as a regime indicator
The Hawkes branching ratio distinguishes Poisson-like from self-exciting flow regimes. Transitions between regimes predict volatility expansions, which makes the ratio useful as a vol-scaling signal in a broader strategy.
import horizon as hz
from horizon.flow.config import HawkesConfig
from horizon.flow.toxicity import HawkesFingerprint

class HawkesRegimeStrategy(hz.Strategy):
    """Increase size when regime is stable (low branching); decrease when
    regime tips toward self-excitation (high branching). Flat volatility
    produces the highest risk-adjusted return for slow mean-reversion."""

    def __init__(self):
        super().__init__()
        self._hawkes: dict[str, HawkesFingerprint] = dict()

    def on_trade(self, market_id: str, timestamp):
        # One fingerprint per market, created lazily on the first trade
        h = self._hawkes.setdefault(
            market_id,
            HawkesFingerprint(HawkesConfig(window_s=300.0, kernel_decay_s=1.0)),
        )
        h.observe(timestamp)

    def evaluate(self, f, universe):
        signals = []
        for market in universe:
            h = self._hawkes.get(market.id)
            if h is None:
                continue
            est = h.estimate()
            # No estimate yet: fall back to a neutral mid-range value
            branching = est.branching_ratio if est else 0.3
            # Size steps down as branching rises
            size_mult = 1.5 if branching < 0.2 else 1.0 if branching < 0.5 else 0.5
            if f.z[market.id] < -2:
                signals.append(hz.Signal.increase(
                    market,
                    edge_bps=int(30 * size_mult),
                    horizon="1d",
                ))
        return signals
Why it works. Filimonov-Sornette (2012) identified near-critical Hawkes regimes (branching → 1) as precursors to flash-crash-like moves. In milder form, elevated branching correlates with realized-vol expansion in the next hour. Trading a slow mean-reversion strategy into such a regime has worse P&L than trading it into a stable one.
This is a regime filter, not a directional signal. It doesn’t tell you which way to trade. It tells you how much to trade.
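The link between clustering and the branching ratio can be illustrated without the library. The sketch below assumes a stationary process and long counting windows, in which case the variance-to-mean ratio of event counts approaches 1 / (1 - n)^2, so n ≈ 1 - sqrt(mean / variance). This moment-based approximation is an assumption made for illustration and is not necessarily how HawkesFingerprint estimates the ratio; `window_counts`, `branching_from_counts`, and `size_multiplier` are hypothetical helpers.

```python
from statistics import mean, pvariance
from typing import List, Optional, Sequence

def window_counts(timestamps: Sequence[float], window_s: float,
                  start: float, end: float) -> List[int]:
    """Count events in consecutive windows of length window_s over [start, end)."""
    n_bins = int((end - start) // window_s)
    counts = [0] * n_bins
    for t in timestamps:
        idx = int((t - start) // window_s)
        if 0 <= idx < n_bins:
            counts[idx] += 1
    return counts

def branching_from_counts(counts: Sequence[int]) -> Optional[float]:
    """Moment-based approximation: n ≈ 1 - sqrt(mean / variance).

    Returns None when there is too little data, and 0.0 when counts are
    no more dispersed than a Poisson process would be.
    """
    if len(counts) < 2:
        return None
    m = mean(counts)
    v = pvariance(counts)
    if m <= 0:
        return None
    if v <= m:
        return 0.0
    return 1.0 - (m / v) ** 0.5

def size_multiplier(branching: float) -> float:
    """Step function with the same thresholds as the strategy above."""
    if branching < 0.2:
        return 1.5
    if branching < 0.5:
        return 1.0
    return 0.5
```

Bursty counts such as [0, 8, 0, 8] have mean 4 and variance 16, which maps to a branching ratio of 0.5 and therefore the smallest size multiplier.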
Recipe 4: consume a reverse-engineered shadow policy
Once you’ve fit a shadow policy for a recurring counterparty, the resulting decision-tree or GBDT can be queried at runtime to predict that counterparty’s next action. This is alpha if the counterparty acts predictably on observable features.
import pickle
import horizon as hz
from horizon.flow.policy.features import PolicyFeatureExtractor, FEATURE_NAMES

class ShadowAwareStrategy(hz.Strategy):
    """Anticipate a known counterparty by querying their shadow policy."""

    TARGET_ACTORS = ["0xKnownBot1", "0xKnownBot2"]

    def __init__(self):
        super().__init__()
        self._models = dict()  # actor_id -> sklearn GBDT
        self._labels = dict()  # actor_id -> class labels, in predict_proba order
        self._feat = PolicyFeatureExtractor()
        for actor in self.TARGET_ACTORS:
            policy = hz.flow.shadow_policy(actor)
            if policy and policy.method.value == "shadow_gbdt":
                # Only unpickle blobs from a store you control:
                # pickle.loads executes arbitrary code on untrusted input.
                blob = pickle.loads(policy.model_blob)
                self._models[actor] = blob["gbdt"]
                self._labels[actor] = blob["labels"]

    def observe_market_event(self, ev):
        # Called by your ingestion wiring; forwards into the feature extractor
        self._feat.observe(ev)

    def evaluate(self, f, universe):
        signals = []
        if not self._models:
            return signals  # no fitted policies loaded, nothing to predict
        for market in universe:
            # For each target actor, predict their next action
            for actor in self.TARGET_ACTORS:
                gbdt = self._models.get(actor)
                if gbdt is None:
                    continue
                state = self._feat.featurize(
                    actor_id=actor,
                    market_id=market.id,
                    now=f.now,
                )
                X = [[state.get(n, 0.0) for n in FEATURE_NAMES]]
                probs = gbdt.predict_proba(X)[0]
                labels = self._labels[actor]
                # If actor is predicted likely to buy in the next window,
                # get in front of them
                p_buy = probs[labels.index("buy")] if "buy" in labels else 0.0
                if p_buy > 0.7:
                    signals.append(hz.Signal.increase(
                        market, edge_bps=50, horizon="1h",
                    ))
        return signals
Why it works. If the counterparty’s behavior is well-explained by observable state (OFI, spread, time-of-day), a fitted shadow policy extrapolates. You get in front of their order, capture some of the impact, then exit as they complete.
Risks. This is the most speculative recipe on the page.
- Overfit policies will predict confidently even when the counterparty’s actual trigger isn’t in your feature set.
- Once you’re trading against a reverse-engineered policy, the counterparty’s actions become partly reactive to yours. The policy stops describing them faithfully.
- Regulatory context matters: trading directly on a reverse-engineered counterparty strategy could be interpreted as layering-adjacent by a zealous regulator. Be able to document independent rationale.
Mitigation.
- Validate that the shadow policy’s holdout_accuracy is above 0.8 before trading on it. Below that, the rules are noise.
- Refit weekly. Behavioral drift is the norm, not the exception.
- Keep the explicit rules (policy.summary["rules"]) rather than just the pickled model. If the rule is “buys when OFI_5s above 0.3 AND spread below 10 bps”, trade based on THAT condition rather than the black-box model prediction. That is much more defensible and robust.
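Trading on an explicit rule rather than the pickled model might look like the sketch below. The structure of policy.summary["rules"] is not documented on this page, so the `Condition` representation, the feature names `ofi_5s` and `spread_bps`, and the helpers `rule_fires` and `should_trade` are all assumptions made for illustration.

```python
import operator
from dataclasses import dataclass
from typing import Mapping, Sequence

# Supported comparison operators for a rule condition
OPS = {">": operator.gt, "<": operator.lt, ">=": operator.ge, "<=": operator.le}

@dataclass(frozen=True)
class Condition:
    """One clause of an explicit rule (hypothetical representation)."""
    feature: str
    op: str
    threshold: float

def rule_fires(conditions: Sequence[Condition], state: Mapping[str, float]) -> bool:
    """True only if every condition holds; a missing feature counts as a miss."""
    if not conditions:
        return False
    for c in conditions:
        value = state.get(c.feature)
        if value is None or not OPS[c.op](value, c.threshold):
            return False
    return True

def should_trade(holdout_accuracy: float, conditions: Sequence[Condition],
                 state: Mapping[str, float], min_accuracy: float = 0.8) -> bool:
    """Gate on holdout accuracy first, then on the explicit rule."""
    return holdout_accuracy > min_accuracy and rule_fires(conditions, state)

# The example rule from the text: OFI_5s above 0.3 AND spread below 10 bps
BUY_RULE = [Condition("ofi_5s", ">", 0.3), Condition("spread_bps", "<", 10.0)]
```

Because the rule is plain data, it can be logged next to each signal, which gives you the documented independent rationale mentioned under Risks.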
Recipe 5: flow-weighted cross-market spread
This is the most general recipe. Across a basket of related markets (for example the candidates of a single prediction event, or the legs of a perp-spot arb), weight your exposure by the relative cleanliness of each market’s flow.
import horizon as hz
from horizon.flow.toxicity import HawkesFingerprint
from horizon.flow.config import HawkesConfig

class FlowWeightedBasket(hz.Strategy):
    """Allocate across related markets inversely to flow toxicity."""

    BASKET = ["0xMarketA", "0xMarketB", "0xMarketC"]  # candidates in one event

    def evaluate(self, f, universe):
        # Pull current toxicity per market
        toxicity = dict()
        for mid in self.BASKET:
            h = self._hawkes_for(mid)
            est = h.estimate() if h else None
            toxicity[mid] = est.branching_ratio if est else 0.3
        # Weight: cleaner markets get bigger share
        # weight ∝ (1 - toxicity)
        weights = {mid: max(0.0, 1.0 - t) for mid, t in toxicity.items()}
        total = sum(weights.values()) or 1.0
        shares = {mid: w / total for mid, w in weights.items()}
        signals = []
        for market in universe:
            if market.id not in self.BASKET:
                continue
            base_edge = 30
            signals.append(hz.Signal.increase(
                market,
                edge_bps=int(base_edge * shares[market.id] * len(self.BASKET)),
                horizon="1h",
            ))
        return signals

    def _hawkes_for(self, mid): ...  # your per-market Hawkes wiring
Why it works. The relative allocation tilts toward markets where your fills are cleaner (less likely to be adverse-selected), improving post-transaction-cost return even if raw alpha per market is identical.
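The weighting arithmetic is easy to check in isolation. A minimal sketch, assuming toxicity is a branching ratio per market: `flow_weighted_edges` is a hypothetical helper, and it uses `round` rather than the `int` truncation in the strategy above so that floating-point noise cannot shave a basis point off the result.

```python
from typing import Dict, Mapping

def flow_weighted_edges(toxicity: Mapping[str, float],
                        base_edge_bps: float = 30) -> Dict[str, int]:
    """Scale a base edge per market by its share of (1 - toxicity).

    Shares are multiplied by the basket size, so a basket with equal
    toxicity everywhere gets exactly base_edge_bps on every market.
    """
    weights = {mid: max(0.0, 1.0 - t) for mid, t in toxicity.items()}
    total = sum(weights.values()) or 1.0   # all-toxic basket: avoid divide by zero
    n = len(toxicity)
    return {mid: round(base_edge_bps * (w / total) * n)
            for mid, w in weights.items()}
```

For toxicities of 0.2, 0.5, and 0.8, the weights are 0.8, 0.5, and 0.2, and the 30 bps base edge becomes 48, 30, and 12 bps. The average edge across the basket stays at 30 bps; only its distribution changes.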
The generic pattern
All five recipes share a structure:
- Read the flow store or run the toxicity estimator in background.
- Translate a flow observation into a size / direction / spread / allocation modifier.
- Apply that modifier to the signal you would have generated anyway.
Flow is a MODIFIER, not a REPLACEMENT for your base strategy. The module doesn’t generate alpha on its own. It reshapes how aggressively you express alpha you already have.
Backtesting flow-aware strategies
Two constraints:
- The flow store has to be populated for the backtest period. Either run the flow engine over recorded feed data first and then run the strategy against the populated store, or run the flow engine inline with the backtest so events flow through both.
- The same data goes into detection AND strategy. Avoid look-ahead: the MarketEvent at time t can update the flow store, but the strategy at time t should only read findings with detected_at <= t. hz.flow.anomalies(until=t) gets you that slice.
# Sketch: run detection and strategy on the same event stream,
# in lockstep, without look-ahead.
for ev in recorded_feed:
    engine.ingest(ev)  # findings emitted with ev.timestamp
    if ev.event_kind == MarketEventKind.TradeTape:
        # Strategy only sees findings up to ev.timestamp
        recent = hz.flow.anomalies(
            market_id=ev.market_id,
            until=ev.timestamp,
        )
        strategy.on_flow(recent)
This is a strict requirement for a compliance-grade backtest. Documentation of the lockstep replay in your validation report is what makes the backtest’s P&L defensible.
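The look-ahead risk is greatest when the store is populated ahead of time, because it then already contains findings from the strategy's future. The self-contained sketch below uses an in-memory stand-in for the flow store to show the `detected_at <= t` filter; `Finding`, `InMemoryFindingStore`, and `replay_visible_counts` are hypothetical names, not horizon classes.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass(frozen=True)
class Finding:
    market_id: str
    detected_at: float   # detection time, in the same units as event timestamps
    confidence: float

class InMemoryFindingStore:
    """Toy stand-in for a pre-populated flow store."""

    def __init__(self) -> None:
        self._findings: List[Finding] = []

    def add(self, finding: Finding) -> None:
        self._findings.append(finding)

    def anomalies(self, market_id: str, until: float) -> List[Finding]:
        # The no-look-ahead rule: only findings detected at or before `until`
        return [f for f in self._findings
                if f.market_id == market_id and f.detected_at <= until]

def replay_visible_counts(store: InMemoryFindingStore, market_id: str,
                          trade_times: Sequence[float]) -> List[Tuple[float, int]]:
    """For each trade time, report how many findings the strategy may see."""
    return [(t, len(store.anomalies(market_id, until=t))) for t in trade_times]
```

A check like this is worth including in the validation report: replaying the same trades against a store populated for the whole period should never surface a finding dated after the trade.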
Related
- Actor profiling. Understanding what a profile’s taxonomy distribution actually means.
- Policy reverse-engineering. Fitting shadow policies; IRL is v0.3.
- Defend recipes. The mirror image: flow as a defensive filter.
- Investigation recipes. Build the knowledge base these strategies consume.