Research

Bundled quant test taxonomy — 239 institutional-grade tests, queryable by stage, strategy, or category

horizon.research ships a 239-test reference library covering every stage of systematic-strategy research, from idea through production monitoring. Tests are annotated with purpose, formula, thresholds, common pitfalls, references, and a cross-reference to the horizon.stats function that implements them when one exists.

The data is the Quant Systematic Research Test Library (v4.0.0) shipped as a JSON file under horizon/research/data/. Treat the library as a reference, not a registry — you don’t extend it at runtime. To add a test, edit the JSON and rebuild.

python
import horizon as hz

lib = hz.research.test_library()
print(lib)
# TestLibrary(version='4.0.0', tests=239, categories=21, stages=16)

Categories

The library has 21 categories. The five largest:

Category                 Count  Examples
lopez_de_prado_methods   28     Fractional differentiation, triple-barrier labeling, sample uniqueness, sequential bootstrap, PBO, DSR, CSCV
catalyst_modeling        26     Event-driven signal validation
adversarial              20     Random-signal control, sign-flip, lag scan, parameter perturbation
stress_testing           18     Regime stress, tail-correlation, capacity stress
predictive_power         14     IC (Spearman / Pearson / Kendall), IC IR, IC decay

Full list via lib.categories().

Pipeline stages

The taxonomy assigns each test to one or more of these 16 stages:

stage_0_idea           → stage_1_data_budget   → stage_2_sniff
stage_3_data_infra     → stage_4_signal        → stage_4c_catalyst
stage_5_portfolio      → stage_6_backtest      → stage_7_robustness
stage_7b_stress_adversarial → stage_8_prereg   → stage_9_holdout
stage_10_paper         → stage_11_live_ramp    → stage_12_production

Each stage has a StagePlan with required tests, recommended tests, and a one-line gate description. Get one via lib.by_stage(...).

Queries

Lookup by id

python
t = lib.get("jarque_bera")
print(t.pretty())
# === jarque_bera : Jarque-Bera joint normality ===
#   category        : distributional
#   purpose         : Joint test of skew = 0 and excess kurt = 0.
#   formula         : JB = T/6 * (b1^2 + (b2-3)^2/4) ~ chi2(2)
#   applies to      : strategies=all, stages=[6, 7]
#   SDK function    : horizon.stats.jarque_bera
#   references      :
#       - Jarque & Bera (1980)

The Test is a frozen dataclass:

python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Test:
    test_id: str
    name: str
    category: str
    purpose: str
    formula: str
    implementation: str
    thresholds: tuple[tuple[str, Any], ...]
    pitfalls: tuple[str, ...]
    references: tuple[str, ...]
    applies_to_stages: tuple[int, ...]
    applies_to_strategies: str | tuple[str, ...]
    sdk_implementation: str | None
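
Because the record is frozen, assigning to a field raises an error instead of silently changing the shared library data. The sketch below shows that behaviour with a hypothetical stand-in class, RecordSketch, which carries only a few of the fields above. It uses only the standard dataclasses module and does not touch horizon itself.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class RecordSketch:
    """Hypothetical stand-in with a subset of the Test fields."""
    test_id: str
    name: str
    category: str
    sdk_implementation: Optional[str] = None


rec = RecordSketch(
    test_id="jarque_bera",
    name="Jarque-Bera joint normality",
    category="distributional",
    sdk_implementation="horizon.stats.jarque_bera",
)

# In-place mutation is rejected on a frozen dataclass.
try:
    rec.category = "custom"
    mutated = True
except FrozenInstanceError:
    mutated = False

# dataclasses.replace builds a modified copy and leaves the original alone.
local_copy = replace(rec, category="custom")
print(mutated, rec.category, local_copy.category)
# False distributional custom
```

A modified copy made this way lives only in your own process; it does not add anything to the bundled library.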

By pipeline stage

python
sp = lib.by_stage("stage_4_signal")
print(sp.pretty())
# === stage_4_signal ===
#   required (8):
#     - ic_spearman              Information Coefficient (Spearman)
#     - ic_information_ratio     IC Information Ratio
#     - ic_bootstrap_ci          Stationary bootstrap CI on mean IC
#     - ic_decay_curve           IC decay curve
#     - signal_autocorrelation   Signal autocorrelation
#     - fama_macbeth_regression  Fama-MacBeth cross-sectional regression
#     - look_ahead_check         Look-ahead audit
#     - information_bar_sampling Information-driven bars
#   recommended (5): ...
#   gate            : CV mean IC and IR above thresholds; CI excludes zero

By strategy type

python
for t in lib.by_strategy("trend_following"):
    print(t.test_id, t.name)

The result includes every test marked applies_to_strategies = "all" plus any strategy-specific addendum tests.

By category

python
ldp = lib.by_category("lopez_de_prado_methods")
print(f"{len(ldp)} tests")
# 28 tests

Free-text search

python
for t in lib.search("autocorrel"):
    print(t.test_id, t.name)
# signal_autocorrelation   Signal autocorrelation
# ljung_box                Ljung-Box autocorrelation test
# durbin_watson            Durbin-Watson
# newey_west_t_stat        Newey-West t-statistic

Gate criteria

python
g = lib.gate("gate_2_robustness")
print(g.pretty())
# === gate_2_robustness : Robustness battery gate ===
#   stage           : stage_7_robustness
#   pass criteria   (12):
#     - DSR > 0.95 after trial-count adjustment
#     - Haircut Sharpe > pre-stated benchmark
#     - ...
#   fail modes      (5):
#     - overfitting_to_parameters
#         → Loopback to stage_4_signal with simpler signal
#     - ...

Or look up by stage:

python
g = lib.gate_for_stage("stage_7_robustness")
# Returns the GateCriteria whose .stage matches, or None.
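
Since gate_for_stage can return None, callers should check the result before touching its attributes. The sketch below shows one way to do that. The gate_summary helper and the _StubLibrary class are hypothetical and exist only so the example runs without horizon installed; the only attribute assumed on the gate object is .stage, as described above.

```python
from types import SimpleNamespace


def gate_summary(lib, stage_id):
    """Describe the gate for a stage, tolerating stages without one."""
    gate = lib.gate_for_stage(stage_id)
    if gate is None:
        return f"{stage_id}: no gate defined"
    return f"{stage_id}: gate found for {gate.stage}"


class _StubLibrary:
    """Hypothetical stand-in exposing only gate_for_stage()."""

    def __init__(self, gates):
        self._gates = gates

    def gate_for_stage(self, stage_id):
        # Mirrors the documented contract: a gate object or None.
        return self._gates.get(stage_id)


stub = _StubLibrary(
    {"stage_7_robustness": SimpleNamespace(stage="stage_7_robustness")}
)
print(gate_summary(stub, "stage_7_robustness"))
print(gate_summary(stub, "stage_0_idea"))
```

With the real library, the same helper should work unchanged when passed the object returned by hz.research.test_library(), assuming the method behaves as documented.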

What’s already implemented

python
impl = lib.implemented()
for t in impl:
    print(f"{t.test_id:35s} → {t.sdk_implementation}")
# bootstrap_sharpe_ci    → horizon.stats.sharpe_ci
# bootstrap_sortino_ci   → horizon.stats.sortino_ci
# jarque_bera            → horizon.stats.jarque_bera
# ljung_box              → horizon.stats.ljung_box

unimpl = lib.unimplemented()
# Useful for picking what to build next; 235 in v4

Putting it together — research workflow

Typical use: design the test plan for a new strategy by walking the stages.

python
import horizon as hz

lib = hz.research.test_library()

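# What does a trend-following strategy need to pass at every stage?
# by_strategy() returns the 'all' tests plus the trend-following addendum.
trend_ids = {t.test_id for t in lib.by_strategy("trend_following")}

for stage_id in lib.stages():
    plan = lib.by_stage(stage_id)
    # Keep only the tests that apply to trend following.
    required = [t for t in plan.required if t.test_id in trend_ids]
    recommended = [t for t in plan.recommended if t.test_id in trend_ids]
    print(f"\n{stage_id}: {len(required)} required, "
          f"{len(recommended)} recommended")
    for t in required: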
        impl = t.sdk_implementation or "—"
        print(f"  {t.test_id:35s}  ({impl})")

The output is a checklist you can work through stage by stage. Tests with an SDK function next to them can be run directly. For the rest, you write the implementation yourself from the formula and references in the test record.
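
To turn the checklist into a work plan, it helps to separate the tests that can be run today from those that still need code. The sketch below does that for any iterable of objects with test_id and sdk_implementation attributes. The split_checklist helper is hypothetical, and the sample entries are SimpleNamespace stand-ins so the block runs without horizon installed.

```python
from types import SimpleNamespace


def split_checklist(tests):
    """Partition tests into (runnable, to_build) by sdk_implementation."""
    runnable = [t for t in tests if t.sdk_implementation]
    to_build = [t for t in tests if not t.sdk_implementation]
    return runnable, to_build


# Stand-in records; with the real library, pass plan.required instead.
sample = [
    SimpleNamespace(test_id="jarque_bera",
                    sdk_implementation="horizon.stats.jarque_bera"),
    SimpleNamespace(test_id="ljung_box",
                    sdk_implementation="horizon.stats.ljung_box"),
    SimpleNamespace(test_id="ic_spearman", sdk_implementation=None),
]

runnable, to_build = split_checklist(sample)
print("run now :", [t.test_id for t in runnable])
print("to build:", [t.test_id for t in to_build])
# run now : ['jarque_bera', 'ljung_box']
# to build: ['ic_spearman']
```

The split preserves the order of the input, so the two lists can be read in the same stage-by-stage sequence as the original checklist.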

Reference

horizon.research.test_library()          → TestLibrary  (cached)
horizon.research.default_library()       → TestLibrary  (cached, same instance)
horizon.research.Test                    → frozen dataclass
horizon.research.StagePlan               → frozen dataclass
horizon.research.GateCriteria            → frozen dataclass
horizon.research.DATA_VERSION            → bundled JSON version

TestLibrary methods:

.version, .metadata
.categories(), .stages(), .strategies(), .gates()
.get(test_id)
.by_category(category) → tuple[Test]
.by_strategy(strategy) → tuple[Test]   (includes 'all' + addendum)
.by_stage(stage_id)    → StagePlan
.gate(gate_id)         → GateCriteria
.gate_for_stage(stage_id) → GateCriteria | None
.search(query, limit=20) → tuple[Test]
.implemented()         → tuple[Test]   (has horizon.stats fn)
.unimplemented()       → tuple[Test]