Determinism

Same seed, same results, every time

Horizon’s backtest loop is deterministic given the same inputs. Running the same backtest twice produces bit-identical equity curves, trade counts, Sharpe ratios, and drawdown numbers.

Why determinism matters

Regression tests: The test suite can assert exact equality of metrics. If a refactor changes the output, the tests fail and you know something broke.
Reproducible research: When you share a backtest result, anyone running the same code with the same seed gets the same answer. No "works on my machine."
Hyperparameter search: Compare parameter variants knowing that differences are from the parameters, not from random seed variance.
Debugging: Reproduce a specific bug exactly by saving the seed and the config.

How it’s achieved

Seeded data sources

SyntheticGBM(seed=42) and SyntheticRegimes(seed=42) use random.Random(seed) internally. Two instances with the same seed produce bit-identical price paths.
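The property is easy to check in isolation. The sketch below is not Horizon's SyntheticGBM. The function gbm_path and all of its parameters are hypothetical, and it only illustrates how a private random.Random(seed) instance yields repeatable paths even when other code touches the global RNG.

```python
import math
import random


def gbm_path(seed: int, n_bars: int = 5, s0: float = 100.0,
             mu: float = 0.05, sigma: float = 0.2, dt: float = 1 / 252) -> list[float]:
    """Hypothetical stand-in for a seeded GBM source (not Horizon's implementation)."""
    rng = random.Random(seed)  # private RNG, independent of the global random state
    prices = [s0]
    for _ in range(n_bars):
        z = rng.gauss(0.0, 1.0)
        step = (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * z
        prices.append(prices[-1] * math.exp(step))
    return prices


path_a = gbm_path(seed=42)
random.seed(999)           # disturbing the global RNG does not affect the private one
path_b = gbm_path(seed=42)
path_c = gbm_path(seed=43)

assert path_a == path_b    # exact equality, not approximate
assert path_a != path_c    # a different seed gives a different path
```

Keeping the generator on its own Random instance, rather than calling random.seed() globally, means an unrelated library that draws random numbers cannot shift the price path.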

No hidden RNG

The run loop, feature store, portfolio sizer, risk engine, and ledger contain no calls to random.random() or np.random.* without explicit seeding.

No wall-clock timestamps

Feature store updates use the bar's timestamp, not datetime.now(). Ledger fills use the bar's timestamp. No temporal drift from run to run.
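As a minimal illustration of the idea, assuming a hypothetical Bar type and record_update helper (neither is Horizon's actual API), a record stamped with the bar's own time is identical on every run, whereas one stamped with datetime.now() never is:

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Bar:
    """Hypothetical bar type for illustration; Horizon's own bar class may differ."""
    symbol: str
    timestamp: datetime
    close: float


def record_update(bar: Bar) -> dict:
    # Stamp the record with the bar's own time, never datetime.now().
    return {"symbol": bar.symbol, "as_of": bar.timestamp.isoformat(), "value": bar.close}


bar = Bar("A", datetime(2024, 1, 2, 16, 0, tzinfo=timezone.utc), 101.5)
first = record_update(bar)
second = record_update(bar)  # stands in for a later run over the same bar
assert first == second
```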

Ordered iteration

Data sources yield bars in strict chronological order. Dict iteration follows insertion order (guaranteed since Python 3.7). List ordering is preserved everywhere.

Verified

The test suite has explicit determinism tests:

python
# tests/test_behavioral_audit.py::TestDeterminism

def test_same_inputs_identical_metrics(self) -> None:
    def _run():
        return hz.run(
            mode="backtest",
            strategies=[TSMomentum(lookback=10, edge_bps=80)],
            asset_classes=[Equity],
            universe=_universe(["A", "B"]),
            data_source=_gbm(["A", "B"], n_bars=120, seed=42),
            portfolio=KellyOptimizer(kelly_fraction=0.25),
            risk=RiskProfile.moderate(),
            backtest=hz.BacktestConfig(initial_cash_usd=100_000),
        )

    r1 = _run()
    r2 = _run()
    assert r1.total_return == r2.total_return
    assert r1.max_drawdown == r2.max_drawdown
    assert r1.sharpe == r2.sharpe
    assert r1.n_trades == r2.n_trades

Same inputs → identical outputs. The test asserts equality, not approximate equality.

Determinism across strategies

Running multiple strategies simultaneously still produces deterministic results:

python
def test_determinism_across_multiple_strategies(self) -> None:
    def _run():
        return hz.run(
            mode="backtest",
            strategies=[
                TSMomentum(lookback=10),
                BollingerMeanRev(window=15, entry_z=2.0),
            ],
            asset_classes=[Equity],
            universe=_universe(["A", "B"]),
            data_source=_gbm(["A", "B"], n_bars=100, seed=7),
            portfolio=KellyOptimizer(kelly_fraction=0.2),
            backtest=hz.BacktestConfig(initial_cash_usd=100_000),
        )

    r1 = _run()
    r2 = _run()
    assert r1.equity_curve == r2.equity_curve
    assert len(r1.trades) == len(r2.trades)
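When the two runs happen in different processes or on different machines, the result objects cannot be compared in memory. One option is to compare a digest of the exact bit patterns instead. The fingerprint helper below is hypothetical and assumes the equity curve is a plain sequence of floats, which may not match Horizon's actual result type.

```python
import hashlib
import struct


def fingerprint(equity_curve: list[float]) -> str:
    """SHA-256 over the IEEE-754 bytes of each value (hypothetical helper)."""
    digest = hashlib.sha256()
    for value in equity_curve:
        digest.update(struct.pack("<d", value))  # little-endian double: exact bits
    return digest.hexdigest()


curve_1 = [100_000.0, 100_250.5, 99_875.25]
curve_2 = [100_000.0, 100_250.5, 99_875.25]
curve_3 = [100_000.0, 100_250.5, 99_875.25 + 1e-9]  # differs far below display precision

assert fingerprint(curve_1) == fingerprint(curve_2)
assert fingerprint(curve_1) != fingerprint(curve_3)
```

Hashing the packed doubles rather than a formatted string avoids rounding away small differences that a strict equality check is meant to catch.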

Determinism in your code

To write a deterministic strategy yourself, avoid:

  • datetime.now(): use ctx.now or the bar’s timestamp
  • random.random() without a seed: instantiate your own random.Random(seed)
  • numpy.random without a seed: use np.random.default_rng(seed)
  • dict ordering in Python before 3.7 (non-issue in modern Python)
  • Set iteration: convert to sorted lists before iterating
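  • Hash-based ordering: str and bytes hashes are randomized per process by default, so anything ordered by hash() (including iteration over a set of strings) can differ between runs; sort by an explicit key instead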
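The set and hash items in the list above come down to one habit: impose an explicit order before iterating. A small sketch, using a hypothetical rank_symbols helper and made-up scores:

```python
def rank_symbols(candidates: set[str], scores: dict[str, float]) -> list[str]:
    """Return candidates ordered by score (descending), ties broken by symbol name."""
    # sorted() fixes the order before anything downstream sees it; iterating the
    # set directly could visit symbols in a different order in another process.
    return sorted(candidates, key=lambda s: (-scores[s], s))


scores = {"MSFT": 0.8, "AAPL": 0.8, "GOOG": 0.3}
ranked = rank_symbols({"GOOG", "MSFT", "AAPL"}, scores)
assert ranked == ["AAPL", "MSFT", "GOOG"]
```

The tie-breaker on the symbol name matters: sorting by score alone would leave tied symbols in whatever order the set happened to yield them.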

Example: a strategy with internal randomness done right

python
import random

class RandomizedStrategy(Strategy):
    asset_classes = [Equity]
    features = {}

    def __init__(self, seed: int = 42):
        self._rng = random.Random(seed)

    def evaluate(self, f, universe):
        # Use the RNG deterministically
        r = self._rng.random()
        if r > 0.5:
            return [Signal.increase(m, edge_bps=30) for m in universe]
        return [Signal.decrease(m, edge_bps=30) for m in universe]

Two runs of this strategy with seed=42 produce the same sequence of random numbers and therefore the same signals.

Breaking determinism

Some things will break determinism if you’re not careful:

For single-threaded pure-Python backtests against seeded synthetic data, none of these apply, and determinism is absolute.

Hash-based determinism

To force Python’s hash to be deterministic across processes:

bash
PYTHONHASHSEED=0 python my_backtest.py

Or set PYTHONHASHSEED as an environment variable in your CI config. Useful when you’re comparing backtest output byte-for-byte across machines.
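To see the effect, the sketch below starts fresh interpreters with a chosen PYTHONHASHSEED and reads back the hash of a string. The helper name hash_in_subprocess is hypothetical; the environment variable itself is standard CPython behavior.

```python
import os
import subprocess
import sys


def hash_in_subprocess(text: str, hash_seed: str) -> int:
    """Return hash(text) as computed by a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=hash_seed)
    out = subprocess.run(
        [sys.executable, "-c", f"print(hash({text!r}))"],
        env=env, capture_output=True, text=True, check=True,
    )
    return int(out.stdout.strip())


# Same seed, separate processes: identical hashes.
fixed_1 = hash_in_subprocess("AAPL", "0")
fixed_2 = hash_in_subprocess("AAPL", "0")
assert fixed_1 == fixed_2

# A different seed will generally give a different hash, which is why the
# iteration order of a set of strings can change from one process to the next.
other = hash_in_subprocess("AAPL", "12345")
```

Fixing the hash seed makes accidental hash-order dependence reproducible, but it does not remove the dependence. Sorting explicitly is still the more robust fix.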

When determinism is NOT guaranteed

  • Live / paper modes: real data feeds can’t be replayed identically
  • Wall-clock-based features: anything using datetime.now() explicitly
  • External service calls: HTTP to a broker, DB queries, etc.
  • Parallel feature computation: if you add multiprocessing for speed, you lose determinism unless you carefully manage process ordering
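For the parallel case, one way to keep results ordered is to fix the input order and collect outputs in that same order. The sketch below uses a thread pool and a hypothetical compute_feature function to stay short; Executor.map gives the same ordering guarantee with ProcessPoolExecutor.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_feature(symbol: str) -> tuple[str, int]:
    """Hypothetical per-symbol feature; stands in for any pure function of its input."""
    return symbol, sum(ord(ch) for ch in symbol)


symbols = sorted({"MSFT", "AAPL", "GOOG"})  # fix the input order first

with ThreadPoolExecutor(max_workers=3) as pool:
    # Executor.map yields results in input order, regardless of which worker
    # finishes first. as_completed() would yield in completion order instead.
    results = list(pool.map(compute_feature, symbols))

assert [sym for sym, _ in results] == ["AAPL", "GOOG", "MSFT"]
```

This only covers result ordering. Workers that share mutable state or draw from a shared RNG need their own handling, for example one seeded RNG per task.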
