Research
Bundled quant test taxonomy — 239 institutional-grade tests, queryable by stage, strategy, or category
horizon.research ships a 239-test reference library covering every
stage of systematic-strategy research, from idea through production
monitoring. Each test is annotated with its purpose, formula, thresholds,
common pitfalls, references, and, when one exists, a cross-reference to
the horizon.stats function that implements it.
The data is the Quant Systematic Research Test Library (v4.0.0)
shipped as a JSON file under horizon/research/data/. Treat the
library as a reference, not a registry — you don’t extend it at
runtime. To add a test, edit the JSON and rebuild.
import horizon as hz
lib = hz.research.test_library()
print(lib)
# TestLibrary(version='4.0.0', tests=239, categories=21, stages=16)
Categories
21 categories. The five biggest:
| Category | Count | Examples |
|---|---|---|
| lopez_de_prado_methods | 28 | Fractional differentiation, triple-barrier labeling, sample uniqueness, sequential bootstrap, PBO, DSR, CSCV |
| catalyst_modeling | 26 | Event-driven signal validation |
| adversarial | 20 | Random-signal control, sign-flip, lag scan, parameter perturbation |
| stress_testing | 18 | Regime stress, tail-correlation, capacity stress |
| predictive_power | 14 | IC (Spearman / Pearson / Kendall), IC IR, IC decay |
Full list via lib.categories().
Pipeline stages
The taxonomy assigns each test to one or more of these 16 stages:
stage_0_idea → stage_1_data_budget → stage_2_sniff →
stage_3_data_infra → stage_4_signal → stage_4c_catalyst →
stage_5_portfolio → stage_6_backtest → stage_7_robustness →
stage_7b_stress_adversarial → stage_8_prereg → stage_9_holdout →
stage_10_paper → stage_11_live_ramp → stage_12_production
Each stage has a StagePlan with required tests, recommended tests, and a one-line gate description. Get one via lib.by_stage(...).
Queries
Lookup by id
t = lib.get("jarque_bera")
print(t.pretty())
# === jarque_bera : Jarque-Bera joint normality ===
# category : distributional
# purpose : Joint test of skew = 0 and excess kurt = 0.
# formula : JB = T/6 * (b1^2 + (b2-3)^2/4) ~ chi2(2)
# applies to : strategies=all, stages=[6, 7]
# SDK function : horizon.stats.jarque_bera
# references :
# - Jarque & Bera (1980)
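The formula in the record above can be checked by hand. The following is a minimal standard-library sketch of that formula, not the horizon.stats.jarque_bera implementation; the function name `jarque_bera_sketch` is hypothetical, and `b1` and `b2` are taken to be the sample skewness and (non-excess) kurtosis computed from population moments.

```python
import math


def jarque_bera_sketch(returns):
    """Return (JB, p_value) for JB = T/6 * (b1^2 + (b2 - 3)^2 / 4).

    Hypothetical helper for illustration only. b1 is sample skewness and
    b2 is sample kurtosis, both from population (1/T) central moments.
    """
    n = len(returns)
    if n < 3:
        raise ValueError("need at least 3 observations")
    mean = sum(returns) / n
    m2 = sum((x - mean) ** 2 for x in returns) / n
    m3 = sum((x - mean) ** 3 for x in returns) / n
    m4 = sum((x - mean) ** 4 for x in returns) / n
    if m2 == 0:
        raise ValueError("series is constant; moments are undefined")
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    # Under the null, JB ~ chi2(2), whose survival function is exp(-x / 2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value


jb, p = jarque_bera_sketch([-2.0, -1.0, 0.0, 1.0, 2.0])
print(f"JB={jb:.4f}, p={p:.4f}")
```

The chi-squared approximation is asymptotic, so p-values from short samples like the one above should not be relied on.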
Test is a frozen dataclass:
from dataclasses import dataclass
from typing import Any

@dataclass(frozen=True)
class Test:
    test_id: str
    name: str
    category: str
    purpose: str
    formula: str
    implementation: str
    thresholds: tuple[tuple[str, Any], ...]
    pitfalls: tuple[str, ...]
    references: tuple[str, ...]
    applies_to_stages: tuple[int, ...]
    applies_to_strategies: str | tuple[str, ...]
    sdk_implementation: str | None
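Because the record is frozen, attribute assignment raises an error, and a modified copy has to be built explicitly. The sketch below shows this with a trimmed, hypothetical stand-in class (`TestRecord`, three fields only) so that it runs without horizon installed; it illustrates standard `dataclasses` behavior, not anything specific to the library.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class TestRecord:
    """Hypothetical, trimmed stand-in for horizon.research.Test."""
    test_id: str
    name: str
    sdk_implementation: Optional[str] = None


record = TestRecord(
    "jarque_bera",
    "Jarque-Bera joint normality",
    "horizon.stats.jarque_bera",
)

# Assigning to a field of a frozen dataclass raises FrozenInstanceError.
try:
    record.name = "renamed"
except FrozenInstanceError:
    mutation_blocked = True
else:
    mutation_blocked = False

# dataclasses.replace builds a new instance and leaves the original untouched.
variant = replace(record, sdk_implementation=None)

# Frozen dataclasses with the default eq=True are hashable, so records
# can be used in sets or as dict keys.
unique = {record, variant, record}
print(mutation_blocked, len(unique))
```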
By pipeline stage
sp = lib.by_stage("stage_4_signal")
print(sp.pretty())
# === stage_4_signal ===
# required (8):
# - ic_spearman Information Coefficient (Spearman)
# - ic_information_ratio IC Information Ratio
# - ic_bootstrap_ci Stationary bootstrap CI on mean IC
# - ic_decay_curve IC decay curve
# - signal_autocorrelation Signal autocorrelation
# - fama_macbeth_regression Fama-MacBeth cross-sectional regression
# - look_ahead_check Look-ahead audit
# - information_bar_sampling Information-driven bars
# recommended (5): ...
# gate : CV mean IC and IR above thresholds; CI excludes zero
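For orientation, the Spearman information coefficient in the required list is the rank correlation between a signal and the returns that follow it. The sketch below computes it with the standard library only, using average ranks for ties; the function names are hypothetical, and this is not the SDK implementation.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def spearman_ic(signal, forward_returns):
    """Spearman IC: Pearson correlation of the two rank vectors."""
    if len(signal) != len(forward_returns):
        raise ValueError("signal and forward_returns must have equal length")
    return pearson(average_ranks(signal), average_ranks(forward_returns))


# Made-up cross-section: the signal orders all but the top two assets correctly.
ic = spearman_ic([1.0, 2.0, 3.0, 4.0, 5.0], [0.01, 0.02, 0.03, 0.05, 0.04])
print(round(ic, 4))
```

In practice the IC is computed per period and then averaged, which is what the gate's mean-IC and confidence-interval criteria refer to.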
By strategy type
for t in lib.by_strategy("trend_following"):
    print(t.test_id, t.name)
Includes all tests marked applies_to_strategies = "all" plus any
strategy-specific addendum tests.
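The inclusion rule can be sketched against the `applies_to_strategies` field, which is either a string or a tuple of strategy names. The code below is an assumption about how such a filter might work, not the library's code; `MiniTest`, the `hypothetical_*` test ids, and the `stat_arb` strategy name are invented for illustration.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class MiniTest:
    """Hypothetical stand-in carrying only the fields the filter needs."""
    test_id: str
    applies_to_strategies: Union[str, Tuple[str, ...]]


def applies_to(test, strategy):
    """True if the test is marked 'all' or names the strategy explicitly."""
    scope = test.applies_to_strategies
    if isinstance(scope, str):
        # Compare whole strings; `strategy in scope` would substring-match.
        return scope == "all" or scope == strategy
    return strategy in scope


def by_strategy(tests, strategy):
    """Return the matching tests as a tuple, preserving catalog order."""
    return tuple(t for t in tests if applies_to(t, strategy))


catalog = (
    MiniTest("jarque_bera", "all"),
    MiniTest("hypothetical_trend_test", ("trend_following",)),
    MiniTest("hypothetical_pairs_test", ("stat_arb",)),
)

selected = by_strategy(catalog, "trend_following")
print([t.test_id for t in selected])
```

The `isinstance` check matters because the field has two shapes: applying `in` to a plain string would test for substrings rather than membership.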
By category
ldp = lib.by_category("lopez_de_prado_methods")
print(f"{len(ldp)} tests")
# 28 tests
Free-text search
for t in lib.search("autocorrel"):
    print(t.test_id, t.name)
# signal_autocorrelation Signal autocorrelation
# ljung_box Ljung-Box autocorrelation test
# durbin_watson Durbin-Watson
# newey_west_t_stat Newey-West t-statistic
Gate criteria
g = lib.gate("gate_2_robustness")
print(g.pretty())
# === gate_2_robustness : Robustness battery gate ===
# stage : stage_7_robustness
# pass criteria (12):
# - DSR > 0.95 after trial-count adjustment
# - Haircut Sharpe > pre-stated benchmark
# - ...
# fail modes (5):
# - overfitting_to_parameters
# → Loopback to stage_4_signal with simpler signal
# - ...
Or look up by stage:
g = lib.gate_for_stage("stage_7_robustness")
# Returns the GateCriteria whose .stage matches, or None.
What’s already implemented
impl = lib.implemented()
for t in impl:
    print(f"{t.test_id:35s} → {t.sdk_implementation}")
# bootstrap_sharpe_ci → horizon.stats.sharpe_ci
# bootstrap_sortino_ci → horizon.stats.sortino_ci
# jarque_bera → horizon.stats.jarque_bera
# ljung_box → horizon.stats.ljung_box
unimpl = lib.unimplemented()
# Useful for picking what to build next; 235 in v4
Putting it together — research workflow
Typical use: design the test plan for a new strategy by walking the stages.
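import horizon as hz

lib = hz.research.test_library()

# What does a trend-following strategy need to pass at every stage?
# Keep only the tests that apply to trend following ('all' + addendum).
trend_ids = {t.test_id for t in lib.by_strategy("trend_following")}

for stage_id in lib.stages():
    plan = lib.by_stage(stage_id)
    required = [t for t in plan.required if t.test_id in trend_ids]
    print(f"\n{stage_id}: {len(required)} required, "
          f"{len(plan.recommended)} recommended")
    for t in required:
        impl = t.sdk_implementation or "—"
        print(f"    {t.test_id:35s} ({impl})")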
The output is a checklist you can work through stage by stage. Tests with an SDK function listed next to them can be run directly; for the rest, you write the implementation yourself, using the formula and references in the test record.
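One way to go from a test record to a callable is to resolve the dotted path stored in `sdk_implementation`. The sketch below assumes those strings are importable dotted paths, as values like `horizon.stats.jarque_bera` suggest; `resolve_dotted` is a hypothetical helper, and the demonstration uses a standard-library path so it runs without horizon installed.

```python
import importlib


def resolve_dotted(path):
    """Import 'package.module.attr' and return the named attribute.

    Hypothetical helper: assumes the final component is an attribute of
    the module named by everything before the last dot.
    """
    module_name, _, attr = path.rpartition(".")
    if not module_name:
        raise ValueError(f"not a dotted path: {path!r}")
    module = importlib.import_module(module_name)
    return getattr(module, attr)


# Stand-in for something like resolve_dotted(t.sdk_implementation).
sqrt = resolve_dotted("math.sqrt")
print(sqrt(9.0))
```

The arguments each horizon.stats function expects are not described in this section, so consult its own documentation before calling a resolved function.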
Reference
horizon.research.test_library() → TestLibrary (cached)
horizon.research.default_library() → TestLibrary (cached, same instance)
horizon.research.Test → frozen dataclass
horizon.research.StagePlan → frozen dataclass
horizon.research.GateCriteria → frozen dataclass
horizon.research.DATA_VERSION → bundled JSON version
TestLibrary methods:
.version, .metadata
.categories(), .stages(), .strategies(), .gates()
.get(test_id)
.by_category(category) → tuple[Test]
.by_strategy(strategy) → tuple[Test] (includes 'all' + addendum)
.by_stage(stage_id) → StagePlan
.gate(gate_id) → GateCriteria
.gate_for_stage(stage_id) → GateCriteria | None
.search(query, limit=20) → tuple[Test]
.implemented() → tuple[Test] (has horizon.stats fn)
.unimplemented() → tuple[Test]