Bootstrap Methods

11 bootstrap variants plus jackknife bias correction for non-parametric inference

Standard confidence intervals assume normally distributed, independent observations. Trading returns are skewed, fat-tailed, and autocorrelated, so those assumptions fail. The bootstrap module provides 11 resampling variants that compute confidence intervals without assuming a parametric distribution, plus jackknife bias correction for small-sample estimators.

API

Core estimators

python
# Mean with confidence interval
result = hz.bootstrap_mean(data, n=10000)
# result: BootstrapResult with .estimate, .ci_lower, .ci_upper, .std_error

# Sharpe ratio (accounts for autocorrelation in the ratio)
result = hz.bootstrap_sharpe(returns, n=10000)

# Value at Risk
result = hz.bootstrap_var(returns, n=10000, alpha=0.05)

# Correlation between two series
result = hz.bootstrap_correlation(x, y, n=10000)
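
To make the idea behind these estimators concrete, here is a minimal NumPy sketch of a percentile bootstrap for an arbitrary statistic. It is not the module's implementation: `percentile_bootstrap`, `sharpe`, the dictionary keys (chosen to mirror the result fields listed above), and the synthetic sample are all assumptions made for illustration, and the module may use a different interval construction.

```python
import numpy as np

def percentile_bootstrap(data, statistic, n=10_000, level=0.95, seed=0):
    """Percentile-bootstrap CI for any statistic (hypothetical helper, not part of hz)."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)
    size = data.shape[0]
    # Each row of idx selects one resample of the original length, with replacement.
    idx = rng.integers(0, size, size=(n, size))
    replicates = np.array([statistic(data[row]) for row in idx])
    # The interval is read off the empirical quantiles of the replicates.
    tail = (1.0 - level) / 2.0
    ci_lower, ci_upper = np.quantile(replicates, [tail, 1.0 - tail])
    return {
        "estimate": float(statistic(data)),
        "ci_lower": float(ci_lower),
        "ci_upper": float(ci_upper),
        "std_error": float(replicates.std(ddof=1)),
    }

def sharpe(returns):
    """Per-period Sharpe ratio with a zero risk-free rate (an assumption of this sketch)."""
    return np.mean(returns) / np.std(returns, ddof=1)

# Synthetic fat-tailed sample standing in for 200 daily returns.
sample = np.random.default_rng(1).standard_t(df=3, size=200) * 0.01

mean_result = percentile_bootstrap(sample, np.mean, n=2000)
sharpe_result = percentile_bootstrap(sample, sharpe, n=2000)
print(mean_result)
print(sharpe_result)
```

This sketch resamples individual observations independently, so it ignores any serial dependence in the data. For autocorrelated returns, the block bootstrap described in the next section is the more appropriate choice.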

Block bootstrap for dependent data

python
# Preserves autocorrelation structure by resampling contiguous blocks
result = hz.block_bootstrap(data, block_size=20, n=5000)
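
As an illustration of resampling contiguous blocks, the sketch below implements a moving-block bootstrap for the mean in NumPy. It is one common block scheme and may not be the variant `hz.block_bootstrap` uses; the helper names, the AR(1) test series, and the percentile interval are assumptions of this example.

```python
import numpy as np

def moving_block_resample(data, block_size, rng):
    """Build one resample by concatenating randomly chosen contiguous blocks."""
    data = np.asarray(data, dtype=float)
    size = data.shape[0]
    n_blocks = -(-size // block_size)  # ceiling division
    # Every start index leaves room for a full block inside the series.
    starts = rng.integers(0, size - block_size + 1, size=n_blocks)
    offsets = np.arange(block_size)
    # Expand each start into a run of consecutive indices, then trim to the original length.
    idx = (starts[:, None] + offsets[None, :]).ravel()[:size]
    return data[idx]

def block_bootstrap_mean(data, block_size=20, n=5000, level=0.95, seed=0):
    """Percentile CI for the mean from moving-block resamples (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    replicates = np.array(
        [moving_block_resample(data, block_size, rng).mean() for _ in range(n)]
    )
    tail = (1.0 - level) / 2.0
    lower, upper = np.quantile(replicates, [tail, 1.0 - tail])
    return float(np.mean(data)), float(lower), float(upper)

# AR(1) series as a stand-in for autocorrelated returns.
rng = np.random.default_rng(42)
noise = rng.normal(size=500)
series = np.empty(500)
series[0] = noise[0]
for t in range(1, 500):
    series[t] = 0.6 * series[t - 1] + noise[t]

estimate, lower, upper = block_bootstrap_mean(series, block_size=20, n=1000)
print(f"mean={estimate:.3f}, 95% CI=[{lower:.3f}, {upper:.3f}]")
```

The block size is a trade-off: blocks shorter than the dependence horizon break up the autocorrelation the method is meant to preserve, while very long blocks leave few distinct blocks to resample from.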

Hypothesis testing

python
# Non-parametric two-sample test: are these drawn from the same distribution?
result = hz.bootstrap_hypothesis_test(strategy_a_returns, strategy_b_returns, n=10000)
# result: HypothesisResult with .p_value, .test_statistic, .reject (at alpha=0.05)
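
One way such a test can work is sketched below: pool both samples, resample from the pool as if the null hypothesis were true, and count how often the resampled difference in means is at least as extreme as the observed one. The choice of statistic, the function name, and the returned keys (mirroring the fields listed above) are assumptions of this example, not a description of the module's internals.

```python
import numpy as np

def bootstrap_two_sample_test(a, b, n=10_000, alpha=0.05, seed=0):
    """Two-sided bootstrap test on the difference in means (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    observed = a.mean() - b.mean()
    pooled = np.concatenate([a, b])
    # Under the null both samples share one distribution, so resample from the pool.
    draws = rng.choice(pooled, size=(n, pooled.size), replace=True)
    diffs = draws[:, : a.size].mean(axis=1) - draws[:, a.size :].mean(axis=1)
    # Share of null differences at least as extreme as the observed one;
    # the +1 terms keep the p-value from being exactly zero.
    p_value = (np.sum(np.abs(diffs) >= abs(observed)) + 1) / (n + 1)
    return {
        "test_statistic": float(observed),
        "p_value": float(p_value),
        "reject": bool(p_value < alpha),
    }

# Synthetic samples; the mean gap is exaggerated so the test has something to detect.
rng = np.random.default_rng(7)
strategy_a = rng.normal(0.010, 0.01, size=250)
strategy_b = rng.normal(0.000, 0.01, size=250)

result = bootstrap_two_sample_test(strategy_a, strategy_b, n=2000)
print(result)
```

Because the statistic here is a difference in means, this sketch only detects a shift in location; two distributions with equal means but different shapes would need a different statistic.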

Jackknife bias correction

python
bias = hz.jackknife_bias(data, statistic="sharpe")
# bias: float -- subtract from your estimate to debias
# Supported statistics: "mean", "variance", "sharpe", "sortino"
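
For reference, the standard leave-one-out jackknife bias estimate is (n - 1) times the gap between the average leave-one-out statistic and the full-sample statistic. The NumPy sketch below follows that textbook formula. It takes a callable rather than a statistic name, and its function names are hypothetical; whether the module computes the bias in exactly this way is an assumption.

```python
import numpy as np

def jackknife_bias(data, statistic):
    """Leave-one-out jackknife bias estimate (hypothetical helper, not part of hz)."""
    data = np.asarray(data, dtype=float)
    size = data.shape[0]
    full = statistic(data)
    # Recompute the statistic with each observation left out in turn.
    leave_one_out = np.array([statistic(np.delete(data, i)) for i in range(size)])
    return (size - 1) * (leave_one_out.mean() - full)

def plugin_variance(x):
    """Variance with divisor n, which is biased downward in small samples."""
    return np.var(x)

sample = np.array([2.0, 4.0, 4.0, 5.0, 9.0])

bias = jackknife_bias(sample, plugin_variance)
corrected = plugin_variance(sample) - bias  # subtract the bias to debias

print(f"plug-in variance: {plugin_variance(sample):.4f}")
print(f"jackknife bias:   {bias:.4f}")
print(f"corrected:        {corrected:.4f}")
print(f"np.var(ddof=1):   {np.var(sample, ddof=1):.4f}")
```

The plug-in variance is a useful sanity check because the jackknife correction recovers the usual unbiased (divisor n - 1) sample variance exactly, while for an already unbiased statistic such as the mean the estimated bias is zero.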

Is your Sharpe ratio real?

The most common question in strategy evaluation: is a Sharpe of 1.8 from 200 observations statistically meaningful? The bootstrap answers directly:

python
returns = [...]  # 200 daily returns from backtest

result = hz.bootstrap_sharpe(returns, n=10000)
print(f"Sharpe: {result.estimate:.2f}")
print(f"95% CI: [{result.ci_lower:.2f}, {result.ci_upper:.2f}]")

# If the 95% CI includes 0, you can't reject the null of zero Sharpe at the 5% level
if result.ci_lower > 0:
    print("Sharpe is statistically positive")
else:
    print("Cannot conclude the Sharpe is positive")

Comparing two strategies

python
test = hz.bootstrap_hypothesis_test(
    strategy_a_returns,
    strategy_b_returns,
    n=10000,
)

if test.reject:
    print(f"Strategies differ (p={test.p_value:.4f})")
else:
    print("No significant difference -- don't overfit to the better backtest")

When to use

  • Strategy validation: confidence intervals on Sharpe, Sortino, max drawdown without assuming normality.
  • A/B testing: compare two strategy variants with a non-parametric test.
  • Small samples: jackknife bias correction when you only have 50-100 observations.
  • Dependent data: block bootstrap for returns with serial correlation or momentum effects.
