Data Providers
Pull real market data from Yahoo, Polygon, Alpaca — bars, options, news, fundamentals — plus CSV, DataFrame, and custom sources
Every provider is reachable as hz.data.<thing>.
How the providers are organized
Each provider has two forms:
- Free function: hz.data.yahoo(...), hz.data.polygon(...), hz.data.alpaca(...). One-shot, returns bars. Best for the simple “give me a year of AAPL” call inside a backtest.
- Class client: hz.data.YahooData(), hz.data.PolygonData(), hz.data.AlpacaData(). Holds credentials once, exposes the full endpoint catalog (bars, quotes, trades, news, dividends, splits, fundamentals, options), and accepts relative date strings ("1y", "6mo", "30d", "now").
You can use either or both. The examples below show the class form first (richer) with the free function as a one-liner alternative.
Credentials
Credentials resolve through three paths, in priority order explicit > secrets > env:
import horizon as hz
from horizon.secrets import AwsSecretsManagerSecrets
# 1. Explicit kwarg — clearest for tests + notebooks
poly = hz.data.PolygonData(api_key="poly_xyz")
alp = hz.data.AlpacaData(api_key="PK...", api_secret="...")
# 2. Env var — default for local dev
# POLYGON_API_KEY=...
# ALPACA_API_KEY=... ALPACA_SECRET_KEY=...
poly = hz.data.PolygonData()
alp = hz.data.AlpacaData()
# 3. Secrets provider — production
poly = hz.data.PolygonData(secrets=AwsSecretsManagerSecrets(region="us-east-1"))
Yahoo needs no credentials. The free functions take credentials as api_key= / secret_key= kwargs (note that the AlpacaData class examples on this page spell the second one api_secret=):
hz.data.alpaca(["AAPL"], start="2024-01-01",
api_key="PK...", secret_key="...")
hz.data.polygon(["AAPL"], start="2024-01-01", api_key="poly_xyz")
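The precedence rule (explicit > secrets > env) can be illustrated with a small standalone sketch. This is not horizon's implementation: resolve_credential, DictSecrets, and the get() method on the secrets object are hypothetical names chosen for illustration.

```python
import os


class DictSecrets:
    """Hypothetical stand-in for a secrets provider; only get() is assumed."""

    def __init__(self, values):
        self._values = dict(values)

    def get(self, name):
        return self._values.get(name)


def resolve_credential(name, explicit=None, secrets=None, environ=None):
    """Return the first value found: explicit kwarg, then secrets, then env."""
    environ = os.environ if environ is None else environ
    if explicit is not None:
        return explicit
    if secrets is not None:
        value = secrets.get(name)
        if value is not None:
            return value
    value = environ.get(name)
    if value is None:
        raise LookupError(f"no credential found for {name}")
    return value


# Usage: the same name resolved with progressively fewer sources available.
env = {"POLYGON_API_KEY": "from_env"}
vault = DictSecrets({"POLYGON_API_KEY": "from_secrets"})
print(resolve_credential("POLYGON_API_KEY", "from_kwarg", vault, env))
print(resolve_credential("POLYGON_API_KEY", None, vault, env))
print(resolve_credential("POLYGON_API_KEY", None, None, env))
```

Passing the environment in as a plain mapping keeps the sketch easy to test without touching real environment variables.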
Relative dates
Anywhere a date is expected, the class clients accept these forms:
| Form | Meaning |
|---|---|
| "1y", "2y", "5y" | N years ago |
| "1mo", "3mo", "6mo" | N months ago |
| "30d", "90d" | N days ago |
| "6h", "15m" | N hours / minutes ago |
| "now" | current UTC time |
| "2024-01-01" | absolute ISO date |
| "2024-01-01T13:00:00Z" | absolute ISO datetime |
| datetime(2024, 1, 1) | Python datetime |
yh.bars("AAPL", start="1y") # 1 year ago to now
poly.bars("AAPL", start="6mo", end="30d") # 6mo ago to 30d ago
hz.data.parse_when(...) exposes the parser directly if needed.
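To make the table concrete, here is a standalone sketch of a parser that accepts the same forms. It is not the code behind hz.data.parse_when; the name parse_when_sketch, the calendar-month arithmetic with day clamping, and the choice to treat naive values as UTC are all assumptions made for illustration.

```python
import re
from calendar import monthrange
from datetime import datetime, timedelta, timezone

_RELATIVE = re.compile(r"^(\d+)(y|mo|d|h|m)$")


def _minus_months(ts, months):
    # Step back whole calendar months, clamping the day (Mar 31 - 1mo -> Feb 28/29).
    index = ts.year * 12 + (ts.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    day = min(ts.day, monthrange(year, month)[1])
    return ts.replace(year=year, month=month, day=day)


def parse_when_sketch(value, now=None):
    """Turn a relative string, ISO string, or datetime into an aware UTC datetime."""
    now = now or datetime.now(timezone.utc)
    if isinstance(value, datetime):
        # Assumption: naive datetimes are interpreted as UTC.
        return value if value.tzinfo else value.replace(tzinfo=timezone.utc)
    if value == "now":
        return now
    match = _RELATIVE.match(value)
    if match:
        n, unit = int(match.group(1)), match.group(2)
        if unit == "y":
            return _minus_months(now, 12 * n)
        if unit == "mo":
            return _minus_months(now, n)
        step = {"d": timedelta(days=n), "h": timedelta(hours=n), "m": timedelta(minutes=n)}
        return now - step[unit]
    # Absolute ISO date or datetime; a trailing "Z" is rewritten for older Pythons.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)


# Usage with a pinned "now" so the results are reproducible.
pinned = datetime(2024, 7, 15, 12, 0, tzinfo=timezone.utc)
print(parse_when_sketch("6mo", pinned))
print(parse_when_sketch("30d", pinned))
```

Note that "m" means minutes and "mo" means months, so the pattern has to try "mo" before "m".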
Yahoo
Free, no API key. Best for prototyping and cross-checks. Backed
by yfinance. Data quality is good enough for research; do not use
for live order routing.
pip install yfinance # one-time, no API key needed
Bars
import horizon as hz
# Class form (relative dates, full surface)
yh = hz.data.YahooData()
src = yh.bars(["AAPL", "MSFT"], start="1y", interval="1d")
# Or free function (same arguments shape as other providers)
src = hz.data.yahoo(["AAPL", "MSFT"], start="2023-01-01", end="2024-01-01")
# Intervals: "1m", "5m", "15m", "30m", "1h", "1d", "5d", "1wk", "1mo"
Company info
info = yh.info("AAPL")
# {'longName': 'Apple Inc.', 'sector': 'Technology',
# 'marketCap': 3e12, 'trailingPE': 30.5,
# 'dividendYield': 0.005, '52WeekHigh': 240, ... }
Corporate actions, options, news
divs = yh.dividends("AAPL") # [(date, cash_amount), ...]
splits = yh.splits("AAPL") # [(date, ratio), ...]
expiries = yh.options_expirations("AAPL")
chain = yh.options_chain("AAPL", expiry="2026-06-19")
# {"calls": [{contractSymbol, strike, bid, ask, impliedVolatility, ...}, ...],
# "puts": [...]}
news = yh.news("AAPL", limit=10)
earnings = yh.earnings_dates("AAPL")
recs = yh.recommendations("AAPL")
Polygon
Broadest catalog. Paid (free tier exists but is limited). Needs POLYGON_API_KEY env var or api_key= kwarg.
poly = hz.data.PolygonData(api_key="poly_xyz")
# or
poly = hz.data.PolygonData() # reads POLYGON_API_KEY
Bars
src = poly.bars(["AAPL", "MSFT"], start="1y", timeframe="1D")
# Or free function
src = hz.data.polygon(["AAPL"], start="2024-01-01")
# Timeframes: 1Min / 5Min / 15Min / 30Min / 1Hour / 4Hour /
# 1Day / 1Week / 1Month (also: 1m, 5m, 1h, 1D, etc.)
NBBO ticks (paid tier)
quotes = poly.quotes("AAPL", start="2024-01-02", end="2024-01-03", limit=10000)
trades = poly.trades("AAPL", start="2024-01-02", end="2024-01-03", limit=10000)
Reference data
details = poly.ticker_details("AAPL") # sector, mcap, employees, ...
divs = poly.dividends("AAPL")
splits = poly.splits("AAPL")
financials = poly.financials("AAPL", limit=10) # full 10-Q/10-K XBRL
news = poly.news("AAPL", limit=20)
status = poly.market_status()
prev = poly.previous_close("AAPL")
Options chain
chain = poly.options_chain(
"AAPL",
expiry="2026-06-19",
option_type="call",
strike_gte=180.0, strike_lte=200.0,
)
# One dict per contract: latest quote, last trade, Greeks, implied vol.
Alpaca
Stocks, options, crypto. Uses the same ALPACA_API_KEY + ALPACA_SECRET_KEY as your Alpaca trading venue.
alp = hz.data.AlpacaData(api_key="PK...", api_secret="...")
# or
alp = hz.data.AlpacaData() # reads env vars
Stock bars
src = alp.bars(["AAPL", "MSFT"], start="1y", timeframe="1Day")
# Or free function
src = hz.data.alpaca(["AAPL"], start="2024-01-01")
# Timeframes: 1Min / 5Min / 15Min / 1Hour / 1Day
Latest top-of-book
quotes = alp.latest_quote(["AAPL", "MSFT"]) # NBBO snapshot
trades = alp.latest_trade(["AAPL", "MSFT"]) # last print
News
news = alp.news(["AAPL", "MSFT"], start="7d", limit=50)
Options chain (snapshot)
chain = alp.options_chain(
"AAPL",
expiry="2026-06-19",
option_type="call",
strike_gte=180.0, strike_lte=200.0,
)
# Returns one dict per contract: latestQuote, latestTrade, greeks,
# impliedVolatility (when the data plan provides them).
# Free-function form:
chain = hz.data.alpaca_options_chain("AAPL", expiry="2026-06-19")
Options bars (historical, per contract)
# Pick contracts from the chain, then pull their history
chain = alp.options_chain("AAPL", expiry="2025-06-20", option_type="call",
strike_gte=195.0, strike_lte=205.0)
symbols = [row["symbol"] for row in chain]
src = alp.options_bars(symbols, start="2025-01-01",
end="2025-06-20", timeframe="1Day")
for bar in src.iter_bars():
    print(bar.market_id, bar.timestamp, bar.close)
# Returns standard Bar objects — pass straight into hz.run:
hz.run(
mode="backtest",
strategies=[MyOptionsStrategy],
asset_classes=[hz.AssetClass.Option],
universe=symbols,
data_source=src,
)
The provider paginates internally and chunks symbol lists into batches of 100, so the symbol list can be arbitrarily long.
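As a rough picture of what batching by 100 means for a long contract list, consider the sketch below. The chunked helper is a hypothetical illustration, not the provider's internal code, and the symbol names are placeholders.

```python
def chunked(symbols, size=100):
    """Yield consecutive slices of at most `size` symbols."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(symbols), size):
        yield symbols[start:start + size]


# Usage: 250 placeholder symbols split into three request-sized batches.
symbols = [f"SYM{i:03d}" for i in range(250)]
batches = list(chunked(symbols))
print([len(batch) for batch in batches])
```

Each batch would correspond to one request (plus any follow-up pages), and the results are concatenated before they reach your code.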
Crypto bars
btc = alp.crypto_bars("BTC/USD", start="30d", timeframe="1Hour")
Crypto bars use the separate /v1beta3/crypto/us endpoint family. Symbols follow the slash-pair convention, for example BTC/USD.
Pandas integration
Every ProviderSource converts to pandas with a single method call:
src = poly.bars(["AAPL", "MSFT", "NVDA"], start="1y")
# Long form (one row per (symbol, timestamp))
df = src.to_pandas()
# columns: market_id, timestamp, open, high, low, close, volume
# Wide form (rows = timestamps, columns = symbols)
wide = src.to_pandas_wide(field="close")
returns = wide.pct_change().dropna()
print(returns.corr())
pandas is a soft dependency; it is installed with the [notebooks] or [research] extra.
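The difference between the long and wide shapes can be seen with plain pandas. The sketch below assumes the wide form is equivalent to pivoting the long frame on timestamp and market_id; that is an assumption about to_pandas_wide, not a statement of how it is implemented. The prices are the ones from the plain-dict example on this page.

```python
import pandas as pd

# Long form: one row per (symbol, timestamp), as described for to_pandas().
long_df = pd.DataFrame(
    {
        "market_id": ["AAPL", "AAPL", "MSFT", "MSFT"],
        "timestamp": pd.to_datetime(
            ["2024-01-02", "2024-01-03", "2024-01-02", "2024-01-03"]
        ),
        "close": [185.0, 186.5, 375.0, 378.2],
    }
)

# Wide form: rows are timestamps, columns are symbols, values are one field.
wide = long_df.pivot(index="timestamp", columns="market_id", values="close")

# Per-symbol simple returns; the first row is NaN and gets dropped.
returns = wide.pct_change().dropna()
print(wide)
print(returns)
```

The wide form is the convenient one for column-wise operations such as pct_change() and corr(), which is why the example above uses it for the correlation matrix.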
CSV
data = hz.data.csv("history.csv")
# Custom column names
data = hz.data.csv("prices.csv",
market_col="ticker", date_col="timestamp",
price_col="adj_close", volume_col="vol",
)
Expected format:
market_id,date,open,high,low,close,volume
AAPL,2024-01-02,185.10,185.50,184.80,185.30,50000000
AAPL,2024-01-03,185.60,186.20,185.00,186.00,48000000
MSFT,2024-01-02,375.00,376.50,374.20,375.80,30000000
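If you want to sanity-check a file against this layout before handing it to hz.data.csv, a few lines of standard-library Python are enough. The load_rows helper below is a hypothetical illustration; its keyword names mirror the column kwargs shown above, but it is not part of horizon.

```python
import csv
import io
from collections import defaultdict

SAMPLE = """\
market_id,date,open,high,low,close,volume
AAPL,2024-01-02,185.10,185.50,184.80,185.30,50000000
AAPL,2024-01-03,185.60,186.20,185.00,186.00,48000000
MSFT,2024-01-02,375.00,376.50,374.20,375.80,30000000
"""


def load_rows(text, market_col="market_id", date_col="date",
              price_col="close", volume_col="volume"):
    """Group (date, price, volume) tuples by market, in file order."""
    by_market = defaultdict(list)
    for row in csv.DictReader(io.StringIO(text)):
        by_market[row[market_col]].append(
            (row[date_col], float(row[price_col]), float(row[volume_col]))
        )
    return dict(by_market)


rows = load_rows(SAMPLE)
print(sorted(rows))     # markets found in the file
print(rows["AAPL"][0])  # first AAPL row as (date, close, volume)
```

A KeyError from this helper means a column name does not match, which is the case the market_col / date_col / price_col / volume_col kwargs are for.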
pandas DataFrame
import pandas as pd
df = pd.read_csv("my_data.csv")
data = hz.data.dataframe(df,
market_col="ticker", date_col="date", price_col="close",
)
hz.run(data_source=data, ...)
Plain dict
data = hz.data.from_dict({
"AAPL": [("2024-01-02", 185.0), ("2024-01-03", 186.5)],
"MSFT": [("2024-01-02", 375.0), ("2024-01-03", 378.2)],
})
Custom provider
Register your own data source:
from horizon.data import providers as dp
from horizon.data.base import Bar

# my_db stands for your own database handle; it is not part of horizon.
@dp.register("my_database")
def fetch_from_db(tickers, start=None, end=None, **kwargs):
    bars = []
    for ticker in tickers:
        for row in my_db.query(ticker, since=start):
            bars.append(Bar(
                market_id=ticker, timestamp=row["date"],
                price=row["close"], open=row["open"],
                high=row["high"], low=row["low"],
                close=row["close"], volume=row["volume"],
            ))
    return bars

data = dp.fetch("my_database", ["AAPL"], start="2023-01-01")
hz.run(data_source=data, ...)
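For readers unfamiliar with the pattern, a name-to-function registry driven by a decorator can be sketched in a few lines. This is a generic illustration of the idea, not horizon's providers module: the registry dict, the tuple rows, and the in_memory provider are all hypothetical.

```python
_REGISTRY = {}


def register(name):
    """Decorator that stores a fetch function under a provider name."""
    def decorator(fn):
        if name in _REGISTRY:
            raise ValueError(f"provider {name!r} is already registered")
        _REGISTRY[name] = fn
        return fn
    return decorator


def fetch(name, tickers, **kwargs):
    """Look up a provider by name and forward the call to it."""
    try:
        provider = _REGISTRY[name]
    except KeyError:
        raise KeyError(f"unknown provider {name!r}; available: {sorted(_REGISTRY)}") from None
    return provider(tickers, **kwargs)


def available_providers():
    return sorted(_REGISTRY)


@register("in_memory")
def fetch_in_memory(tickers, start=None, end=None, **kwargs):
    # Toy data source; ISO date strings compare correctly as plain strings.
    data = {"AAPL": [("2024-01-02", 185.0), ("2024-01-03", 186.5)]}
    return [
        (ticker, date, price)
        for ticker in tickers
        for date, price in data.get(ticker, [])
        if start is None or date >= start
    ]


print(available_providers())
print(fetch("in_memory", ["AAPL"], start="2024-01-03"))
```

Because the decorator returns the function unchanged, the registered provider can still be called directly in tests.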
Reference
hz.data.available_providers()
# ['alpaca', 'alpaca_options', 'polygon', 'yahoo']
| Provider | Asset classes | Credentials | Install |
|---|---|---|---|
| Yahoo | Equities, ETFs, crypto | None (free) | pip install yfinance |
| Polygon | Equities, options, crypto, forex | POLYGON_API_KEY | pip install requests |
| Alpaca stocks | Equities | ALPACA_API_KEY + ALPACA_SECRET_KEY | pip install requests |
| Alpaca options | Options snapshot + historical | same as above | same |
| Alpaca crypto | Crypto pairs | same as above | same |
| CSV / DataFrame / Dict / Custom | Any | None | Built-in |