Bootstrap Methods
11 bootstrap variants plus jackknife bias correction for non-parametric inference
Standard confidence intervals assume independent, normally distributed observations. Trading returns are skewed, fat-tailed, and autocorrelated, so they violate those assumptions. The bootstrap module provides 11 resampling variants that compute confidence intervals without assuming a particular distribution, plus jackknife bias correction for small-sample estimators.
API
Core estimators
python
# Mean with confidence interval
result = hz.bootstrap_mean(data, n=10000)
# result: BootstrapResult with .estimate, .ci_lower, .ci_upper, .std_error
# Sharpe ratio (accounts for autocorrelation in the ratio)
result = hz.bootstrap_sharpe(returns, n=10000)
# Value at Risk
result = hz.bootstrap_var(returns, n=10000, alpha=0.05)
# Correlation between two series
result = hz.bootstrap_correlation(x, y, n=10000)
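To make the resampling idea concrete, here is a minimal percentile-bootstrap sketch for the mean using only the Python standard library. It is an illustration of the general technique, not the module's implementation: the function name `percentile_bootstrap_mean`, the `confidence` and `seed` parameters, and the dictionary return value are assumptions chosen to mirror the fields listed above.

```python
import random
import statistics


def percentile_bootstrap_mean(data, n=10000, confidence=0.95, seed=0):
    """Percentile-bootstrap CI for the mean (hypothetical helper, not hz's code)."""
    rng = random.Random(seed)
    size = len(data)
    # Each replicate is the mean of len(data) points drawn with replacement.
    replicates = sorted(
        statistics.fmean(rng.choices(data, k=size)) for _ in range(n)
    )
    # Read the interval bounds off the empirical quantiles of the replicates.
    tail = (1.0 - confidence) / 2.0
    lower = replicates[int(tail * (n - 1))]
    upper = replicates[int((1.0 - tail) * (n - 1))]
    return {
        "estimate": statistics.fmean(data),
        "ci_lower": lower,
        "ci_upper": upper,
        # The spread of the replicates approximates the standard error.
        "std_error": statistics.stdev(replicates),
    }


# Made-up daily returns, for illustration only.
sample = [0.012, -0.004, 0.007, -0.021, 0.015, 0.003, -0.009, 0.018, 0.001, -0.002]
ci = percentile_bootstrap_mean(sample, n=2000)
print(ci)
```

Swapping `statistics.fmean` for another statistic (a Sharpe ratio, a quantile) gives the corresponding interval, which is why one resampling loop can back several estimators.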
Block bootstrap for dependent data
python
# Preserves autocorrelation structure by resampling contiguous blocks
result = hz.block_bootstrap(data, block_size=20, n=5000)
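One common way to resample contiguous blocks is the moving block bootstrap, sketched below with the standard library. Whether `hz.block_bootstrap` uses this exact scheme (as opposed to circular or stationary blocks) is not stated here, so treat the helper names `moving_block_resample` and `block_bootstrap_mean`, the `confidence` and `seed` parameters, and the choice of the mean as the statistic as assumptions.

```python
import random
import statistics


def moving_block_resample(data, block_size, rng):
    """One resample built from random contiguous blocks, trimmed to len(data)."""
    size = len(data)
    last_start = size - block_size  # last index at which a full block still fits
    resample = []
    while len(resample) < size:
        start = rng.randint(0, last_start)  # randint includes both endpoints
        resample.extend(data[start:start + block_size])
    return resample[:size]


def block_bootstrap_mean(data, block_size=20, n=5000, confidence=0.95, seed=0):
    """Percentile CI for the mean from moving-block resamples (hypothetical helper)."""
    if not 1 <= block_size <= len(data):
        raise ValueError("block_size must be between 1 and len(data)")
    rng = random.Random(seed)
    replicates = sorted(
        statistics.fmean(moving_block_resample(data, block_size, rng))
        for _ in range(n)
    )
    tail = (1.0 - confidence) / 2.0
    return {
        "estimate": statistics.fmean(data),
        "ci_lower": replicates[int(tail * (n - 1))],
        "ci_upper": replicates[int((1.0 - tail) * (n - 1))],
    }


# Synthetic AR(1) series so that neighbouring observations are correlated.
gen = random.Random(42)
series = [0.0]
for _ in range(199):
    series.append(0.6 * series[-1] + gen.gauss(0.0, 0.01))

result = block_bootstrap_mean(series, block_size=20, n=1000)
print(result)
```

Within each block the original ordering is kept, so short-range dependence survives the resampling; dependence across block boundaries is lost, which is why the block size should be at least as long as the correlation horizon you care about.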
Hypothesis testing
python
# Non-parametric two-sample test: are these drawn from the same distribution?
result = hz.bootstrap_hypothesis_test(strategy_a_returns, strategy_b_returns, n=10000)
# result: HypothesisResult with .p_value, .test_statistic, .reject (at alpha=0.05)
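A two-sample bootstrap test can be sketched as follows. This version resamples from the pooled data (which is what the null hypothesis of a common distribution implies) and uses the difference in means as the test statistic. The statistic, the add-one p-value correction, and the names `bootstrap_two_sample_test` and `seed` are assumptions for illustration; the module's own test may differ.

```python
import random
import statistics


def bootstrap_two_sample_test(a, b, n=10000, alpha=0.05, seed=0):
    """Two-sided pooled-resampling test on the difference in means (hypothetical)."""
    rng = random.Random(seed)
    observed = statistics.fmean(a) - statistics.fmean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n):
        # Under the null, both samples are draws from the same pooled distribution.
        sim_a = rng.choices(pooled, k=len(a))
        sim_b = rng.choices(pooled, k=len(b))
        diff = statistics.fmean(sim_a) - statistics.fmean(sim_b)
        if abs(diff) >= abs(observed):
            extreme += 1
    # The add-one correction keeps the p-value from being reported as exactly 0.
    p_value = (extreme + 1) / (n + 1)
    return {"test_statistic": observed, "p_value": p_value, "reject": p_value < alpha}


# Two clearly separated synthetic samples, and one sample compared with itself.
low = [0.01 * i for i in range(30)]
high = [1.0 + 0.01 * i for i in range(30)]
different = bootstrap_two_sample_test(low, high, n=2000)
identical = bootstrap_two_sample_test(low, low, n=2000)
print(different["reject"], identical["reject"])
```

A difference-in-means statistic only detects a shift in location. Comparing strategies on risk-adjusted terms would call for a different statistic, such as the difference in Sharpe ratios.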
Jackknife bias correction
python
bias = hz.jackknife_bias(data, statistic="sharpe")
# bias: float -- subtract from your estimate to debias
# Supported statistics: "mean", "variance", "sharpe", "sortino"
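The jackknife bias estimate comes from recomputing the statistic with each observation left out in turn: bias ≈ (n − 1) × (mean of the leave-one-out estimates − full-sample estimate). The sketch below is a generic standard-library version that takes a callable rather than a statistic name, which is an assumption and differs from the string-based call shown above. It is checked on the plug-in variance, where the correction is known to recover the unbiased sample variance.

```python
import math
import statistics


def jackknife_bias(data, statistic):
    """Jackknife bias estimate for `statistic` (hypothetical helper, callable-based)."""
    n = len(data)
    full = statistic(data)
    # Recompute the statistic n times, each time leaving one observation out.
    leave_one_out = [statistic(data[:i] + data[i + 1:]) for i in range(n)]
    return (n - 1) * (statistics.fmean(leave_one_out) - full)


# Made-up returns, for illustration only.
sample = [0.012, -0.004, 0.007, -0.021, 0.015, 0.003, -0.009, 0.018, 0.001, -0.002]

plug_in = statistics.pvariance(sample)          # divides by n, biased downward
bias = jackknife_bias(sample, statistics.pvariance)
corrected = plug_in - bias                      # subtract the bias to debias

# For the variance, the corrected value matches the unbiased (n - 1) estimator.
print(math.isclose(corrected, statistics.variance(sample), rel_tol=1e-9))
```

For nonlinear statistics such as the Sharpe ratio the correction is only approximate, but the same leave-one-out loop applies.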
Is your Sharpe ratio real?
The most common question in strategy evaluation: is a Sharpe of 1.8 from 200 observations statistically meaningful? The bootstrap answers directly:
python
returns = [...] # 200 daily returns from backtest
result = hz.bootstrap_sharpe(returns, n=10000)
print(f"Sharpe: {result.estimate:.2f}")
print(f"95% CI: [{result.ci_lower:.2f}, {result.ci_upper:.2f}]")
# If CI includes 0, you can't reject the null of zero Sharpe
if result.ci_lower > 0:
    print("Sharpe is statistically positive")
Comparing two strategies
python
test = hz.bootstrap_hypothesis_test(
    strategy_a_returns,
    strategy_b_returns,
    n=10000,
)
if test.reject:
    print(f"Strategies differ (p={test.p_value:.4f})")
else:
    print("No significant difference -- don't overfit to the better backtest")
When to use
- Strategy validation: confidence intervals on Sharpe, Sortino, max drawdown without assuming normality.
- A/B testing: compare two strategy variants with a non-parametric test.
- Small samples: jackknife bias correction when you only have 50-100 observations.
- Dependent data: block bootstrap for returns with serial correlation or momentum effects.