Out-of-Sample
Train/test split testing, the simplest overfitting check
OutOfSample splits historical data into a training window (the first X% of the history) and a testing window (the remaining (100 - X)%). The strategy is run on each window separately and the two sets of metrics are compared.
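To make the split concrete, here is a minimal sketch of how a date range could be cut into the two windows. It assumes the split is made by calendar days; the library may split by bars or trading days instead, and `split_window` is a hypothetical helper, not part of `horizon`.

```python
from datetime import date, timedelta


def split_window(
    start: date, end: date, train_pct: float = 0.7
) -> tuple[tuple[date, date], tuple[date, date]]:
    """Split [start, end] into (train, test) windows by calendar days.

    Hypothetical helper for illustration only; not a horizon API.
    """
    if not 0.0 < train_pct < 1.0:
        raise ValueError("train_pct must be strictly between 0 and 1")
    total_days = (end - start).days
    # Truncate so the training window never overshoots its share.
    train_days = int(total_days * train_pct)
    train_end = start + timedelta(days=train_days)
    # The test window starts the day after training ends, so they never overlap.
    test_start = train_end + timedelta(days=1)
    return (start, train_end), (test_start, end)


train, test = split_window(date(2020, 1, 1), date(2024, 12, 31), train_pct=0.7)
print("train:", train[0], "->", train[1])
print("test: ", test[0], "->", test[1])
```

The important property is that the test window lies strictly after the training window, so nothing the strategy was tuned on leaks into the evaluation period.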
Import
```python
from horizon.validate import OutOfSample
```
Signature
```python
OutOfSample(
    train_pct: float = 0.7,
    thresholds: dict[str, float] | None = None,
)
```
train_pct (float): Fraction of history used for training. `0.7` = 70% train, 30% test.
Usage
```python
import horizon as hz  # assumed alias for the hz.* names used below
from horizon.validate import OutOfSample

# MyStrategy, my_universe and Equity are assumed to be defined or imported elsewhere.
oos = OutOfSample(train_pct=0.7)
result = oos.run(
    strategy=MyStrategy,
    backtest=hz.BacktestConfig(
        start="2020-01-01",
        end="2024-12-31",
        initial_cash_usd=100_000,
    ),
    universe=my_universe,
    asset_classes=[Equity],
)

print("In-sample metrics:")
print(f"  Sharpe: {result.is_metrics['sharpe']:+.3f}")
print(f"  Max DD: {result.is_metrics['max_drawdown']:.2%}")

print("Out-of-sample metrics:")
print(f"  Sharpe: {result.oos_metrics['sharpe']:+.3f}")
print(f"  Max DD: {result.oos_metrics['max_drawdown']:.2%}")

print("Degradation:")
for metric, ratio in result.degradation.items():
    print(f"  {metric}: OOS / IS = {ratio:.2%}")
```
Degradation ratio
```python
degradation[metric] = oos_metrics[metric] / is_metrics[metric]
```
A ratio near 1.0 means the out-of-sample result is close to the in-sample one, so the strategy generalizes well. A ratio under 0.5 signals significant degradation and suggests the in-sample result was at least partially overfit.
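As a worked example of the formula, the sketch below computes the ratio for two metrics. The numbers are made up for illustration, `degradation_ratios` is a hypothetical helper, and returning NaN for a zero in-sample value is an assumption of this sketch rather than documented library behavior.

```python
import math


def degradation_ratios(
    is_metrics: dict[str, float], oos_metrics: dict[str, float]
) -> dict[str, float]:
    """Compute OOS / IS for every metric present in both dicts.

    Hypothetical helper for illustration only; not a horizon API.
    """
    ratios: dict[str, float] = {}
    for name, is_value in is_metrics.items():
        if name not in oos_metrics:
            continue
        # Assumption: a zero in-sample value makes the ratio undefined.
        ratios[name] = math.nan if is_value == 0 else oos_metrics[name] / is_value
    return ratios


# Made-up metrics, not real backtest output.
is_metrics = {"sharpe": 1.6, "max_drawdown": 0.12}
oos_metrics = {"sharpe": 0.6, "max_drawdown": 0.18}

ratios = degradation_ratios(is_metrics, oos_metrics)
for name, ratio in ratios.items():
    print(f"{name}: OOS / IS = {ratio:.2%}")
```

Here the Sharpe ratio falls below the 0.5 mark, which points to degradation. The "near 1.0 is good" reading applies most directly to higher-is-better metrics such as Sharpe: for a lower-is-better metric such as max drawdown, a ratio above 1.0 means the out-of-sample result is worse, and a negative in-sample value makes the ratio hard to interpret at all.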
Thresholds
```python
oos = OutOfSample(
    train_pct=0.7,
    thresholds={
        "oos_sharpe_min": 0.5,           # OOS Sharpe must exceed 0.5
        "is_oos_sharpe_ratio_max": 2.0,  # IS Sharpe can't be more than 2x OOS
    },
)
```
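The two threshold keys above can be read as simple pass/fail checks on the in-sample and out-of-sample Sharpe ratios. The sketch below shows one plausible interpretation based only on the comments in the example; `check_thresholds` is a hypothetical helper, and how the library actually evaluates or reports thresholds is not shown here.

```python
def check_thresholds(
    is_metrics: dict[str, float],
    oos_metrics: dict[str, float],
    thresholds: dict[str, float],
) -> dict[str, bool]:
    """Return {threshold_name: passed} for the two Sharpe-based keys.

    Hypothetical helper for illustration only; not a horizon API.
    """
    results: dict[str, bool] = {}
    oos_sharpe = oos_metrics["sharpe"]

    if "oos_sharpe_min" in thresholds:
        # "OOS Sharpe must exceed 0.5" is read as a strict inequality.
        results["oos_sharpe_min"] = oos_sharpe > thresholds["oos_sharpe_min"]

    if "is_oos_sharpe_ratio_max" in thresholds:
        if oos_sharpe <= 0:
            # Assumption: a non-positive OOS Sharpe always fails this check,
            # since IS / OOS is then undefined or negative.
            results["is_oos_sharpe_ratio_max"] = False
        else:
            ratio = is_metrics["sharpe"] / oos_sharpe
            results["is_oos_sharpe_ratio_max"] = (
                ratio <= thresholds["is_oos_sharpe_ratio_max"]
            )
    return results


# Made-up metrics, not real backtest output.
checks = check_thresholds(
    is_metrics={"sharpe": 1.6},
    oos_metrics={"sharpe": 0.6},
    thresholds={"oos_sharpe_min": 0.5, "is_oos_sharpe_ratio_max": 2.0},
)
print(checks)
```

With these numbers the out-of-sample Sharpe clears the minimum, but the in-sample Sharpe is more than twice the out-of-sample one, so the second check fails. Using both checks together guards against a strategy that is merely acceptable out of sample while looking far better in sample.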