Hierarchical Risk Parity (HRP)
de Prado's clustering-based portfolio construction: avoid the inverse-covariance instability
HRP is López de Prado’s alternative to classical mean-variance optimization. It builds portfolios from hierarchical clustering, recursive bisection, and inverse-variance weighting, without ever inverting a covariance matrix. That single property makes it far more robust than Markowitz optimization on real financial data.
Why not mean-variance?
Classical mean-variance requires inverting the covariance matrix:
w* = (λ · Σ)⁻¹ · μ
The problem: Σ⁻¹ is numerically unstable when:
- The matrix is nearly singular (highly correlated assets)
- You have few observations relative to the number of assets (T < N is a catastrophe)
- There are near-duplicate rows/columns
Small changes in estimated Σ cause huge changes in w*. Estimates shift from run to run. Turnover is enormous. Results are unreliable.
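To make the instability concrete, here is a minimal sketch using NumPy. It solves the unconstrained formula above for three assets, then lowers a single pairwise correlation by 0.005 and measures how far w* moves. The helper names, the expected returns, and the 20% volatility are illustrative assumptions, not part of Horizon.

```python
import numpy as np


def mean_variance_weights(cov, mu, risk_aversion=1.0):
    """Unconstrained mean-variance weights: solve (lambda * Sigma) w = mu."""
    return np.linalg.solve(risk_aversion * cov, mu)


def equicorrelated_cov(rho, vol=0.2, n=3):
    """Covariance where every asset has volatility `vol` and every pair has correlation `rho`."""
    corr = np.full((n, n), rho)
    np.fill_diagonal(corr, 1.0)
    return corr * vol**2


def relative_weight_shift(rho, bump=0.005, vol=0.2):
    """Relative change in w* after lowering the correlation of assets 0 and 1 by `bump`."""
    mu = np.array([0.05, 0.06, 0.07])  # illustrative expected returns
    cov = equicorrelated_cov(rho, vol=vol)
    bumped = cov.copy()
    bumped[0, 1] -= bump * vol**2
    bumped[1, 0] -= bump * vol**2
    w = mean_variance_weights(cov, mu)
    w_bumped = mean_variance_weights(bumped, mu)
    return np.linalg.norm(w_bumped - w) / np.linalg.norm(w)


for rho in (0.2, 0.99):
    cond = np.linalg.cond(equicorrelated_cov(rho))
    shift = relative_weight_shift(rho)
    print(f"rho={rho}: cond(Sigma)={cond:.1f}, relative shift in w*={shift:.4f}")
```

With weakly correlated assets the same perturbation barely moves the weights; with correlations near 1 the condition number of Σ grows and the weights swing by a large fraction of their size.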
The HRP algorithm
Compute the correlation matrix
Convert correlations to distances
d(i,j) = √(0.5 × (1 - ρ(i,j))): highly correlated assets have small distance.
Build a hierarchical cluster tree
Quasi-diagonalize the covariance
Recursive bisection
Final weights
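The steps above can be sketched end to end in a few dozen lines, assuming NumPy and SciPy are available. This is an illustration, not Horizon's or pypfopt's implementation: the names `hrp_weights` and `cluster_variance` are hypothetical, and the sketch clusters directly on d(i,j) with single linkage, skipping the further distance-of-distances transform that the original paper applies before clustering.

```python
import numpy as np
from scipy.cluster.hierarchy import leaves_list, linkage
from scipy.spatial.distance import squareform


def cluster_variance(cov, idx):
    """Variance of a cluster when its members are weighted by inverse variance."""
    sub = cov[np.ix_(idx, idx)]
    ivp = 1.0 / np.diag(sub)
    ivp /= ivp.sum()
    return float(ivp @ sub @ ivp)


def hrp_weights(returns):
    """HRP weights for a (T, N) array of returns; the result has length N and sums to 1."""
    cov = np.cov(returns, rowvar=False)
    corr = np.corrcoef(returns, rowvar=False)

    # Correlation -> distance: d = sqrt(0.5 * (1 - rho)); clip guards against rounding
    dist = np.sqrt(np.clip(0.5 * (1.0 - corr), 0.0, None))
    np.fill_diagonal(dist, 0.0)

    # Hierarchical cluster tree on the condensed distance vector
    tree = linkage(squareform(dist, checks=False), method="single")

    # Quasi-diagonalization: reorder assets by the leaf order of the tree
    order = leaves_list(tree).tolist()

    # Recursive bisection: split the ordered list in half and divide the
    # weight between halves in inverse proportion to their cluster variance
    weights = np.ones(cov.shape[0])
    stack = [order]
    while stack:
        cluster = stack.pop()
        if len(cluster) < 2:
            continue
        mid = len(cluster) // 2
        left, right = cluster[:mid], cluster[mid:]
        var_left = cluster_variance(cov, left)
        var_right = cluster_variance(cov, right)
        alpha = 1.0 - var_left / (var_left + var_right)
        weights[left] *= alpha
        weights[right] *= 1.0 - alpha
        stack.extend([left, right])
    return weights
```

Note that the only per-cluster computation is an inverse of the diagonal variances, so no step inverts the full covariance matrix. The weights are always positive and sum to 1 by construction.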
Key properties
Using HRP in Horizon
HRP is on Horizon’s sizer roadmap for a future release. Until then, the cleanest path is to wrap the pypfopt library in a custom sizer:
pip install horizon[opt]
import pandas as pd
from pypfopt import HRPOpt  # pypfopt exposes HRP as the HRPOpt class

from horizon.portfolio.base import (
    CashSnapshot,
    CovarianceModel,
    PortfolioConstraints,
    Sizer,
)
from horizon.types import Signal, TargetPosition, Direction, Urgency


class HRPSizer:
    """Horizon Sizer using pypfopt's Hierarchical Risk Parity."""

    name = "hrp"

    def __init__(
        self,
        gross_notional: float = 100_000,
        returns_history: pd.DataFrame | None = None,
    ):
        self.gross_notional = gross_notional
        self.returns_history = returns_history

    def optimize(
        self,
        signals: list[Signal],
        current_positions: dict,
        cash: CashSnapshot,
        cov: CovarianceModel | None,
        constraints: PortfolioConstraints,
    ) -> list[TargetPosition]:
        # HRP needs at least two assets and a returns history to cluster
        if len(signals) < 2 or self.returns_history is None:
            return self._fallback_equal_weight(signals, cash)

        # Extract a sub-DataFrame for the signal markets
        market_ids = [s.market_id for s in signals]
        try:
            sub_returns = self.returns_history[market_ids].dropna()
        except KeyError:
            # At least one market has no column in the history
            return self._fallback_equal_weight(signals, cash)

        if len(sub_returns) < 20:
            return self._fallback_equal_weight(signals, cash)

        # Run HRP; optimize() returns a mapping of market_id -> weight
        hrp = HRPOpt(returns=sub_returns)
        weights = hrp.optimize()

        # Convert to target positions
        equity = max(cash.total_equity_usd, 1.0)
        targets = []
        for sig in signals:
            w = weights.get(sig.market_id, 0.0)
            direction_sign = 1 if sig.direction == Direction.Increase else -1
            notional = w * equity * direction_sign
            targets.append(TargetPosition(
                market_id=sig.market_id,
                target_notional_usd=notional,
                urgency=sig.urgency,
                reason=f"HRP w={w:.3f} [{sig.strategy_id}]",
                contributing_signal_ids=(sig.strategy_id,),
            ))
        return targets

    def _fallback_equal_weight(self, signals, cash):
        # Split equity evenly when HRP cannot run
        equity = max(cash.total_equity_usd, 1.0)
        per_slot = equity / max(len(signals), 1)
        return [
            TargetPosition(
                market_id=s.market_id,
                target_notional_usd=per_slot * s.direction.sign,
                urgency=s.urgency,
                reason="HRP fallback (insufficient history)",
                contributing_signal_ids=(s.strategy_id,),
            )
            for s in signals
        ]
Use it:
import pandas as pd
import horizon as hz

# Load historical returns from wherever
returns_df = pd.read_csv("historical_returns.csv", index_col="date", parse_dates=True)

result = hz.run(
    mode="backtest",
    portfolio=HRPSizer(returns_history=returns_df),
    strategies=[...],
    ...
)
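The sizer above indexes `returns_history` by market id and drops any row containing a NaN, so one market with a short history can shrink the usable sample for all of them. If you start from prices rather than returns, a small helper like the following sketch can build a suitable frame. The function name `prices_to_returns` and the wide price layout (one column per market id, indexed by date) are assumptions for illustration, not part of Horizon.

```python
import pandas as pd


def prices_to_returns(prices: pd.DataFrame, min_obs: int = 20) -> pd.DataFrame:
    """Simple period returns from a wide price frame (one column per market id)."""
    prices = prices.sort_index()
    # r_t = p_t / p_{t-1} - 1; the first row has no predecessor, so drop it
    returns = (prices / prices.shift(1) - 1.0).iloc[1:]
    # Keep only markets with enough observations to be worth clustering
    enough = returns.count() >= min_obs
    return returns.loc[:, enough]
```

The default of 20 for `min_obs` mirrors the minimum-row check in the sizer. Filtering short-history markets up front means they fall back to equal weight through the `KeyError` path instead of silently truncating everyone else's history.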
When to use HRP
HRP vs Kelly vs EqualWeight
| Property | HRP | Kelly | EqualWeight |
|---|---|---|---|
| Uses expected returns? | No | Yes | No |
| Matrix inversion? | No | Yes (implicit, via Σ⁻¹) | No |
| Handles correlation? | Yes (clustering) | Yes (via Σ) | No |
| Turnover | Low | Medium-High | Very low |
| Needs sample history? | ~20 obs | ~100 obs | None |
| Stable? | Yes | Depends on Σ estimation | Trivially |
| Research standard? | de Prado | Economic theory | Baseline |
Pitfalls
Further reading
- “Building Diversified Portfolios that Outperform Out-of-Sample”: de Prado (2016), the original HRP paper
- “A Robust Estimator of the Efficient Frontier”: de Prado (2019), NCO (Nested Clustered Optimization)
- pypfopt documentation: practical implementation, including HRP + Critical Line Algorithm + Black-Litterman