Elastic Net Selection
Lasso, ridge, and elastic net for feature selection and signal construction
When you have many candidate features and need to determine which ones actually predict returns, elastic net regression provides regularized feature selection. Its L1 penalty drives the coefficients of uninformative features to exactly zero, so the fitted model is sparse and the surviving features are the selected ones.
API
Elastic net
model = hz.elastic_net_fit(X, y, alpha=0.1, l1_ratio=0.5)
# X: list of lists (n_samples x n_features)
# y: list of floats (n_samples)
# alpha: regularization strength (higher = stronger shrinkage; sparser when l1_ratio > 0)
# l1_ratio: 0.0 = pure ridge, 1.0 = pure lasso, 0.5 = balanced
print(model.coefficients) # list of floats, one per feature
print(model.intercept) # float
print(model.nonzero_count) # number of features with nonzero weight
predictions = model.predict(X_new)
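For reference, one common way to write the objective that `alpha` and `l1_ratio` control is the scikit-learn convention shown here. Whether `hz.elastic_net_fit` scales the terms exactly this way is an assumption; the API listing does not confirm it. In this notation $n$ is the number of samples, $w$ the coefficient vector, $b$ the intercept, and $\rho$ stands for `l1_ratio`.

```latex
\min_{w,\,b}\; \frac{1}{2n}\,\lVert y - Xw - b \rVert_2^2
  \;+\; \alpha\,\rho\,\lVert w \rVert_1
  \;+\; \frac{\alpha\,(1-\rho)}{2}\,\lVert w \rVert_2^2
```

With $\rho = 1$ only the L1 term remains (lasso), and with $\rho = 0$ only the L2 term remains (ridge), which matches the `l1_ratio` endpoints described above.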
Lasso (L1 only)
model = hz.lasso_fit(X, y, alpha=0.1)
# Equivalent to elastic_net_fit with l1_ratio=1.0
# Produces the sparsest models -- aggressive feature elimination
Ridge (L2 only)
model = hz.ridge_fit(X, y, alpha=0.1)
# Equivalent to elastic_net_fit with l1_ratio=0.0
# Shrinks coefficients but doesn't zero them out
# Use when all features are relevant but you need regularization
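To make the difference between the penalties concrete, the sketch below implements a small coordinate-descent elastic net in plain Python and fits it to synthetic data. It is not hz's implementation: `soft_threshold`, `elastic_net_cd`, and the data are hypothetical, and the objective scaling (squared error divided by 2n) is an assumption.

```python
import random

def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold band becomes 0.0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net_cd(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Illustrative coordinate descent (hypothetical helper, not hz's code)."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    b = sum(y) / n
    l1 = alpha * l1_ratio          # weight on the L1 (lasso) term
    l2 = alpha * (1.0 - l1_ratio)  # weight on the L2 (ridge) term
    for _ in range(n_iter):
        for j in range(p):
            rho = 0.0  # covariance of feature j with the partial residual
            z = 0.0    # mean square of feature j
            for i in range(n):
                pred_without_j = b + sum(X[i][k] * w[k] for k in range(p) if k != j)
                rho += X[i][j] * (y[i] - pred_without_j)
                z += X[i][j] ** 2
            rho /= n
            z /= n
            denom = z + l2
            # The L1 part zeroes small coefficients; the L2 part only shrinks them.
            w[j] = soft_threshold(rho, l1) / denom if denom > 0 else 0.0
        # The intercept is unpenalized: set it to the mean remaining residual.
        b = sum(y[i] - sum(X[i][k] * w[k] for k in range(p)) for i in range(n)) / n
    return w, b

# Synthetic data: only the first two of four features drive y.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(200)]
y = [2.0 * r[0] - 1.0 * r[1] + rng.gauss(0, 0.1) for r in X]

lasso_w, _ = elastic_net_cd(X, y, alpha=0.1, l1_ratio=1.0)
ridge_w, _ = elastic_net_cd(X, y, alpha=0.1, l1_ratio=0.0)
print("lasso:", [round(c, 3) for c in lasso_w])
print("ridge:", [round(c, 3) for c in ridge_w])
```

In this setup the lasso fit should leave the two irrelevant features at exactly zero, while the ridge fit keeps small nonzero weights on them. That is the practical distinction between the two penalties.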
Feature selection workflow
Find which features predict next-day returns from a pool of candidates:
import numpy as np
feature_names = ["momentum_20", "vol_60", "rsi_14", "spread_z",
                 "vpin", "flow_imbalance", "sentiment", "funding_rate"]
# X: historical feature values, y: next-day returns
model = hz.elastic_net_fit(X_train, y_train, alpha=0.05, l1_ratio=0.7)
# Which features survived?
selected = []
for name, coef in zip(feature_names, model.coefficients):
    if abs(coef) > 1e-8:
        selected.append((name, coef))
        print(f" {name}: {coef:+.6f}")
print(f"\n{len(selected)} of {len(feature_names)} features selected")
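One practical caveat for this workflow: the penalty acts on coefficient magnitudes, so features measured on very different scales are penalized unevenly. The source does not say whether `hz.elastic_net_fit` standardizes inputs internally. If it does not, a z-score step along the lines of this sketch may help; `fit_scaler` and `apply_scaler` are hypothetical helpers, and the sample rows are made up for illustration.

```python
from statistics import mean, pstdev

def fit_scaler(X):
    """Compute per-column mean and standard deviation from training rows only."""
    columns = list(zip(*X))
    means = [mean(col) for col in columns]
    # A constant column has zero deviation; fall back to 1.0 to avoid dividing by zero.
    stds = [pstdev(col) or 1.0 for col in columns]
    return means, stds

def apply_scaler(X, means, stds):
    """Z-score each row using statistics learned on the training set."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in X]

# Made-up rows: two features on very different scales.
X_train_raw = [[1.0, 100.0], [2.0, 300.0], [3.0, 200.0]]
means, stds = fit_scaler(X_train_raw)
X_train_scaled = apply_scaler(X_train_raw, means, stds)
```

Reusing the training-set `means` and `stds` for any later data keeps the live features on the same scale the coefficients were fitted on.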
Constructing a composite signal
Once you know which features matter, the fitted coefficients give you a linear signal:
model = hz.elastic_net_fit(X_train, y_train, alpha=0.05, l1_ratio=0.5)
# In your strategy's evaluate():
def evaluate(self, f, universe):
    for m in universe:
        # Feature order must match the column order of X_train
        features = [f.momentum[m.id], f.vol[m.id], f.rsi[m.id],
                    f.spread_z[m.id], f.vpin[m.id], f.flow[m.id],
                    f.sentiment[m.id], f.funding[m.id]]
        score = model.predict([features])[0]
        # score > 0: predicted positive return, score < 0: predicted negative
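Since the fitted model is linear, the composite score is just the intercept plus a weighted sum of the features. The sketch below spells that out, assuming `model.predict` applies no extra preprocessing (the source does not confirm this). `linear_score` and the coefficient values are hypothetical.

```python
def linear_score(coefficients, intercept, features):
    """Intercept plus the dot product of coefficients and feature values."""
    return intercept + sum(c * x for c, x in zip(coefficients, features))

# Hypothetical fitted values; the middle feature was zeroed out by the penalty.
coefficients = [0.5, 0.0, -0.25]
intercept = 0.1
features = [2.0, 9.0, 4.0]

score = linear_score(coefficients, intercept, features)

# Only features with nonzero weight need to be computed at run time.
active = [i for i, c in enumerate(coefficients) if abs(c) > 1e-8]
```

A side benefit of the sparsity is visible in `active`: features whose coefficient is zero contribute nothing to the score, so a strategy can skip computing them.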
When to use
- Feature discovery: you have 20+ candidate features and want to know which 5 actually matter.
- Signal construction: combine multiple weak predictors into a single composite score.
- Overfitting control: regularization limits how much the model can fit noise when your training sample is short.
Higher alpha means more aggressive pruning: more coefficients are driven to zero. Fit with several values spanning 0.001 to 1.0 and compare out-of-sample performance to choose the tradeoff between sparsity and fit. Where the relationship is predominantly nonlinear, tree-based methods will usually do better, but for the linear component of return prediction elastic net is a reliable starting point.
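The alpha sweep described above could be organized roughly as follows. This is a sketch under assumptions: `sweep_alphas`, `log_grid`, and `mse` are hypothetical helpers, `fit_fn` is expected to behave like `hz.elastic_net_fit` (same keyword arguments, and a returned model exposing `predict` and `nonzero_count`), and the validation set is assumed to come from a period after the training set so the comparison is genuinely out of sample.

```python
def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)

def log_grid(low, high, num):
    """Return num log-spaced values from low to high inclusive (num >= 2)."""
    ratio = (high / low) ** (1.0 / (num - 1))
    return [low * ratio ** i for i in range(num)]

def sweep_alphas(fit_fn, X_train, y_train, X_val, y_val, alphas, l1_ratio=0.5):
    """Fit once per alpha; record validation error and model sparsity."""
    results = []
    for alpha in alphas:
        model = fit_fn(X_train, y_train, alpha=alpha, l1_ratio=l1_ratio)
        error = mse(y_val, model.predict(X_val))
        results.append((alpha, error, model.nonzero_count))
    return results

def best_alpha(results):
    """Pick the alpha with the lowest validation error."""
    return min(results, key=lambda r: r[1])[0]
```

A call might look like `results = sweep_alphas(hz.elastic_net_fit, X_train, y_train, X_val, y_val, log_grid(0.001, 1.0, 7))`. Looking at the `nonzero_count` column alongside the error shows how much sparsity each alpha buys.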