Data Providers

Pull real market data from Yahoo, Polygon, Alpaca — bars, options, news, fundamentals — plus CSV, DataFrame, and custom sources

Every provider is reachable as hz.data.<thing>.

How the providers are organized

Each provider has two forms:

  • Free function: hz.data.yahoo(...), hz.data.polygon(...), hz.data.alpaca(...). One-shot, returns bars. Best for the simple “give me a year of AAPL” call inside a backtest.
  • Class client: hz.data.YahooData(), hz.data.PolygonData(), hz.data.AlpacaData(). Holds credentials once, exposes the full endpoint catalog (bars, quotes, trades, news, dividends, splits, fundamentals, options), and accepts relative date strings ("1y", "6mo", "30d", "now").

You can use either or both. The examples below show the class form first (richer) with the free function as a one-liner alternative.

Credentials

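Credentials resolve through three paths. When more than one is present, precedence is explicit kwarg > secrets provider > env var (the examples below are ordered by typical use, not by precedence):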

python
import horizon as hz
from horizon.secrets import AwsSecretsManagerSecrets

# 1. Explicit kwarg — clearest for tests + notebooks
poly = hz.data.PolygonData(api_key="poly_xyz")
alp  = hz.data.AlpacaData(api_key="PK...", api_secret="...")

# 2. Env var — default for local dev
#    POLYGON_API_KEY=...
#    ALPACA_API_KEY=...  ALPACA_SECRET_KEY=...
poly = hz.data.PolygonData()
alp  = hz.data.AlpacaData()

# 3. Secrets provider — production
poly = hz.data.PolygonData(secrets=AwsSecretsManagerSecrets(region="us-east-1"))
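
The precedence rule (explicit > secrets > env) can be pictured as a short lookup chain. The following is a minimal sketch of that idea, not Horizon's actual implementation: resolve_credential, DictSecrets, and the get(name) method on the secrets object are hypothetical names introduced only for illustration.

```python
import os
from typing import Optional, Protocol


class SecretsProvider(Protocol):
    """Hypothetical interface: any object with get(name) -> Optional[str]."""

    def get(self, name: str) -> Optional[str]: ...


class DictSecrets:
    """Stand-in secrets provider backed by a plain dict (illustration only)."""

    def __init__(self, values: dict):
        self._values = dict(values)

    def get(self, name: str) -> Optional[str]:
        return self._values.get(name)


def resolve_credential(name: str,
                       explicit: Optional[str] = None,
                       secrets: Optional[SecretsProvider] = None) -> str:
    """Return the first value found: explicit kwarg, then secrets, then env."""
    if explicit:
        return explicit
    if secrets is not None:
        value = secrets.get(name)
        if value:
            return value
    value = os.environ.get(name)
    if value:
        return value
    raise LookupError(f"no credential found for {name}")
```

Under this sketch, an explicit key always wins, which is why the explicit form is the safest choice in tests: a stray environment variable cannot change the result.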

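Yahoo needs no credentials. The free functions accept credentials as keyword arguments too. Note that the Alpaca free function is shown with secret_key=, whereas the AlpacaData class above is shown with api_secret=: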

python
hz.data.alpaca(["AAPL"], start="2024-01-01",
               api_key="PK...", secret_key="...")
hz.data.polygon(["AAPL"], start="2024-01-01", api_key="poly_xyz")

Relative dates

Anywhere a date is expected, the class clients accept these forms:

Form                      Meaning
------------------------  --------------------------
"1y", "2y", "5y"          N years ago
"1mo", "3mo", "6mo"       N months ago
"30d", "90d"              N days ago
"6h", "15m"               N hours / minutes ago
"now"                     current UTC time
"2024-01-01"              absolute ISO date
"2024-01-01T13:00:00Z"    absolute ISO datetime
datetime(2024, 1, 1)      Python datetime

python
yh.bars("AAPL", start="1y")               # 1 year ago to now
poly.bars("AAPL", start="6mo", end="30d") # 6mo ago to 30d ago

hz.data.parse_when(...) exposes the parser directly if needed.
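
To make the accepted forms concrete, here is a rough, standalone sketch of how such a parser could work. It is not the code behind hz.data.parse_when: the name parse_when_sketch, the now= parameter, the choice to treat naive datetimes as UTC, and the day-clamping rule for month arithmetic are all assumptions made for this example.

```python
import calendar
import re
from datetime import datetime, timedelta, timezone
from typing import Optional, Union

# "mo" is listed before "m" so that "6mo" is read as months, not minutes.
_RELATIVE = re.compile(r"^(\d+)(y|mo|d|h|m)$")


def _minus_months(moment: datetime, months: int) -> datetime:
    """Step back whole calendar months, clamping the day to the month length."""
    index = moment.year * 12 + (moment.month - 1) - months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return moment.replace(year=year, month=month, day=min(moment.day, last_day))


def parse_when_sketch(value: Union[str, datetime],
                      now: Optional[datetime] = None) -> datetime:
    """Turn a relative string, ISO string, or datetime into an aware UTC datetime."""
    now = now or datetime.now(timezone.utc)
    if isinstance(value, datetime):
        # Assumption: naive datetimes are interpreted as UTC.
        return value if value.tzinfo else value.replace(tzinfo=timezone.utc)
    if value == "now":
        return now
    match = _RELATIVE.match(value)
    if match:
        count, unit = int(match.group(1)), match.group(2)
        if unit == "y":
            return _minus_months(now, 12 * count)
        if unit == "mo":
            return _minus_months(now, count)
        if unit == "d":
            return now - timedelta(days=count)
        if unit == "h":
            return now - timedelta(hours=count)
        return now - timedelta(minutes=count)
    # Absolute ISO forms; older Pythons reject a trailing "Z", so normalize it.
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    return parsed if parsed.tzinfo else parsed.replace(tzinfo=timezone.utc)
```

Passing a fixed now makes the relative forms deterministic, which is convenient when a backtest window needs to be reproducible.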


Yahoo

Free, no API key. Best for prototyping and cross-checks. Backed by yfinance. Data quality is good enough for research; do not use for live order routing.

bash
pip install yfinance   # one-time, no API key needed

Bars

python
import horizon as hz

# Class form (relative dates, full surface)
yh = hz.data.YahooData()
src = yh.bars(["AAPL", "MSFT"], start="1y", interval="1d")

# Or free function (same arguments shape as other providers)
src = hz.data.yahoo(["AAPL", "MSFT"], start="2023-01-01", end="2024-01-01")

# Intervals: "1m", "5m", "15m", "30m", "1h", "1d", "5d", "1wk", "1mo"

Company info

python
info = yh.info("AAPL")
# {'longName': 'Apple Inc.', 'sector': 'Technology',
#  'marketCap': 3e12, 'trailingPE': 30.5,
#  'dividendYield': 0.005, '52WeekHigh': 240, ... }

Corporate actions, options, news

python
divs   = yh.dividends("AAPL")              # [(date, cash_amount), ...]
splits = yh.splits("AAPL")                  # [(date, ratio), ...]

expiries = yh.options_expirations("AAPL")
chain    = yh.options_chain("AAPL", expiry="2026-06-19")
# {"calls": [{contractSymbol, strike, bid, ask, impliedVolatility, ...}, ...],
#  "puts":  [...]}

news     = yh.news("AAPL", limit=10)
earnings = yh.earnings_dates("AAPL")
recs     = yh.recommendations("AAPL")

Polygon

Broadest catalog. Paid (free tier exists but is limited). Needs POLYGON_API_KEY env var or api_key= kwarg.

python
poly = hz.data.PolygonData(api_key="poly_xyz")
# or
poly = hz.data.PolygonData()    # reads POLYGON_API_KEY

Bars

python
src = poly.bars(["AAPL", "MSFT"], start="1y", timeframe="1D")

# Or free function
src = hz.data.polygon(["AAPL"], start="2024-01-01")

# Timeframes: 1Min / 5Min / 15Min / 30Min / 1Hour / 4Hour /
#             1Day / 1Week / 1Month  (also: 1m, 5m, 1h, 1D, etc.)

NBBO ticks (paid tier)

python
quotes = poly.quotes("AAPL", start="2024-01-02", end="2024-01-03", limit=10000)
trades = poly.trades("AAPL", start="2024-01-02", end="2024-01-03", limit=10000)

Reference data

python
details    = poly.ticker_details("AAPL")        # sector, mcap, employees, ...
divs       = poly.dividends("AAPL")
splits     = poly.splits("AAPL")
financials = poly.financials("AAPL", limit=10)  # full 10-Q/10-K XBRL
news       = poly.news("AAPL", limit=20)
status     = poly.market_status()
prev       = poly.previous_close("AAPL")

Options chain

python
chain = poly.options_chain(
    "AAPL",
    expiry="2026-06-19",
    option_type="call",
    strike_gte=180.0, strike_lte=200.0,
)
# One dict per contract: latest quote, last trade, Greeks, implied vol.

Alpaca

Stocks, options, crypto. Same ALPACA_API_KEY + ALPACA_SECRET_KEY your trading venue uses.

python
alp = hz.data.AlpacaData(api_key="PK...", api_secret="...")
# or
alp = hz.data.AlpacaData()     # reads env vars

Stock bars

python
src = alp.bars(["AAPL", "MSFT"], start="1y", timeframe="1Day")

# Or free function
src = hz.data.alpaca(["AAPL"], start="2024-01-01")

# Timeframes: 1Min / 5Min / 15Min / 1Hour / 1Day

Latest top-of-book

python
quotes = alp.latest_quote(["AAPL", "MSFT"])     # NBBO snapshot
trades = alp.latest_trade(["AAPL", "MSFT"])     # last print

News

python
news = alp.news(["AAPL", "MSFT"], start="7d", limit=50)

Options chain (snapshot)

python
chain = alp.options_chain(
    "AAPL",
    expiry="2026-06-19",
    option_type="call",
    strike_gte=180.0, strike_lte=200.0,
)
# Returns one dict per contract: latestQuote, latestTrade, greeks,
# impliedVolatility (when the data plan provides them).

# Free-function form:
chain = hz.data.alpaca_options_chain("AAPL", expiry="2026-06-19")

Options bars (historical, per contract)

python
# Pick contracts from the chain, then pull their history
chain   = alp.options_chain("AAPL", expiry="2025-06-20", option_type="call",
                            strike_gte=195.0, strike_lte=205.0)
symbols = [row["symbol"] for row in chain]

src = alp.options_bars(symbols, start="2025-01-01",
                       end="2025-06-20", timeframe="1Day")
for bar in src.iter_bars():
    print(bar.market_id, bar.timestamp, bar.close)

# Returns standard Bar objects — pass straight into hz.run:
hz.run(
    mode="backtest",
    strategies=[MyOptionsStrategy],
    asset_classes=[hz.AssetClass.Option],
    universe=symbols,
    data_source=src,
)
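
Each option bar is keyed by its contract symbol, so it is often useful to recover the underlying, expiry, type, and strike from that symbol. The sketch below assumes the symbols follow the OCC convention (root, YYMMDD, C or P, strike times 1000 padded to eight digits). parse_occ_symbol and OptionContract are hypothetical helpers written for this example, not part of Horizon, and the pattern deliberately ignores adjusted roots that contain digits.

```python
import re
from datetime import date
from typing import NamedTuple


class OptionContract(NamedTuple):
    underlying: str
    expiry: date
    option_type: str  # "call" or "put"
    strike: float


# Root (1-6 letters), YYMMDD, C/P flag, strike * 1000 as 8 digits.
_OCC_PATTERN = re.compile(r"^([A-Z]{1,6})(\d{2})(\d{2})(\d{2})([CP])(\d{8})$")


def parse_occ_symbol(symbol: str) -> OptionContract:
    """Split an OCC-style option symbol into its components."""
    # Some feeds pad the root with spaces; strip them before matching.
    match = _OCC_PATTERN.match(symbol.replace(" ", ""))
    if match is None:
        raise ValueError(f"not an OCC-style option symbol: {symbol!r}")
    root, yy, mm, dd, flag, strike = match.groups()
    return OptionContract(
        underlying=root,
        expiry=date(2000 + int(yy), int(mm), int(dd)),
        option_type="call" if flag == "C" else "put",
        strike=int(strike) / 1000.0,
    )
```

With this helper, bars can be grouped by strike or expiry using parse_occ_symbol(bar.market_id) without a second chain lookup.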

The provider paginates internally and chunks symbol lists into batches of 100, so the symbol list can be arbitrarily long.
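
The batching idea is simple to picture. This is a generic sketch of splitting a symbol list into batches of 100, not the provider's actual code; chunked is a hypothetical helper name.

```python
from typing import Iterator, List, Sequence


def chunked(symbols: Sequence[str], size: int = 100) -> Iterator[List[str]]:
    """Yield consecutive batches of at most `size` symbols, preserving order."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(symbols), size):
        yield list(symbols[start:start + size])


# 250 contracts become three requests: 100, 100, and 50 symbols.
batches = list(chunked([f"SYM{i}" for i in range(250)]))
print([len(batch) for batch in batches])
```

Because the provider already does this, callers do not need to batch anything themselves; the sketch only shows why list length is not a constraint.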

Crypto bars

python
btc = alp.crypto_bars("BTC/USD", start="30d", timeframe="1Hour")

Uses the separate /v1beta3/crypto/us endpoint family. Symbols use the slash-pair convention.


Pandas integration

Every ProviderSource converts to pandas in one method:

python
src = poly.bars(["AAPL", "MSFT", "NVDA"], start="1y")

# Long form (one row per (symbol, timestamp))
df = src.to_pandas()
# columns: market_id, timestamp, open, high, low, close, volume

# Wide form (rows = timestamps, columns = symbols)
wide = src.to_pandas_wide(field="close")
returns = wide.pct_change().dropna()
print(returns.corr())

pandas is a soft dependency; it is installed with the [notebooks] or [research] extra.
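
For readers who want to see what the long and wide shapes mean without installing pandas, here is a plain-Python illustration of the same reshaping. pivot_close and simple_returns are hypothetical helpers written for this example; only the column names (market_id, timestamp, close) come from the long form described above.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def pivot_close(rows: Iterable[dict]) -> Dict[str, Dict[str, float]]:
    """Long rows -> {timestamp: {symbol: close}}, sorted by timestamp."""
    wide: Dict[str, Dict[str, float]] = defaultdict(dict)
    for row in rows:
        wide[row["timestamp"]][row["market_id"]] = row["close"]
    return dict(sorted(wide.items()))


def simple_returns(wide: Dict[str, Dict[str, float]], symbol: str) -> List[float]:
    """Period-over-period returns for one symbol (the analogue of pct_change)."""
    prices = [cols[symbol] for cols in wide.values() if symbol in cols]
    return [curr / prev - 1.0 for prev, curr in zip(prices, prices[1:])]


long_rows = [
    {"market_id": "AAPL", "timestamp": "2024-01-02", "close": 100.0},
    {"market_id": "MSFT", "timestamp": "2024-01-02", "close": 200.0},
    {"market_id": "AAPL", "timestamp": "2024-01-03", "close": 110.0},
    {"market_id": "MSFT", "timestamp": "2024-01-03", "close": 190.0},
]
wide = pivot_close(long_rows)
```

In the wide shape each timestamp appears once and each symbol becomes a column, which is what makes cross-symbol operations such as correlation straightforward.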


CSV

python
data = hz.data.csv("history.csv")

# Custom column names
data = hz.data.csv("prices.csv",
    market_col="ticker", date_col="timestamp",
    price_col="adj_close", volume_col="vol",
)

Expected format:

csv
market_id,date,open,high,low,close,volume
AAPL,2024-01-02,185.10,185.50,184.80,185.30,50000000
AAPL,2024-01-03,185.60,186.20,185.00,186.00,48000000
MSFT,2024-01-02,375.00,376.50,374.20,375.80,30000000
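
Before handing a file to the loader, it can help to confirm that it matches the expected layout. The following standard-library sketch reads the same column layout and groups rows by market_id; it is independent of Horizon, and load_bars is a hypothetical helper name.

```python
import csv
import io
from collections import defaultdict
from typing import Dict, List, TextIO

SAMPLE = """market_id,date,open,high,low,close,volume
AAPL,2024-01-02,185.10,185.50,184.80,185.30,50000000
AAPL,2024-01-03,185.60,186.20,185.00,186.00,48000000
MSFT,2024-01-02,375.00,376.50,374.20,375.80,30000000
"""


def load_bars(stream: TextIO) -> Dict[str, List[dict]]:
    """Parse the expected CSV layout into {market_id: [bar dicts]}."""
    bars: Dict[str, List[dict]] = defaultdict(list)
    for row in csv.DictReader(stream):
        bars[row["market_id"]].append({
            "date": row["date"],
            "open": float(row["open"]),
            "high": float(row["high"]),
            "low": float(row["low"]),
            "close": float(row["close"]),
            "volume": int(row["volume"]),
        })
    return dict(bars)


bars = load_bars(io.StringIO(SAMPLE))
```

A missing column raises KeyError and a non-numeric price raises ValueError, so a malformed file fails early instead of partway through a backtest.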

pandas DataFrame

python
import pandas as pd

df = pd.read_csv("my_data.csv")
data = hz.data.dataframe(df,
    market_col="ticker", date_col="date", price_col="close",
)
hz.run(data_source=data, ...)

Plain dict

python
data = hz.data.from_dict({
    "AAPL": [("2024-01-02", 185.0), ("2024-01-03", 186.5)],
    "MSFT": [("2024-01-02", 375.0), ("2024-01-03", 378.2)],
})

Custom provider

Register your own data source:

python
from horizon.data import providers as dp
from horizon.data.base import Bar

@dp.register("my_database")
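def fetch_from_db(tickers, start=None, end=None, **kwargs):
    # my_db is a placeholder for your own database client.
    # Note: end is accepted but not applied here; filter on it if your query supports it.
    bars = []
    for ticker in tickers:
        for row in my_db.query(ticker, since=start):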
            bars.append(Bar(
                market_id=ticker, timestamp=row["date"],
                price=row["close"], open=row["open"],
                high=row["high"], low=row["low"],
                close=row["close"], volume=row["volume"],
            ))
    return bars

data = dp.fetch("my_database", ["AAPL"], start="2023-01-01")
hz.run(data_source=data, ...)
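
The decorator-based registration used here is a common registry pattern. The sketch below is a self-contained stand-in that mimics the register / fetch / available_providers shape. It is not Horizon's implementation, and the in_memory provider with its tuple output is purely illustrative.

```python
from typing import Callable, Dict, List

# Module-level registry mapping a provider name to its fetch function.
_PROVIDERS: Dict[str, Callable[..., list]] = {}


def register(name: str) -> Callable[[Callable[..., list]], Callable[..., list]]:
    """Decorator that stores the function under `name` and returns it unchanged."""
    def decorator(func: Callable[..., list]) -> Callable[..., list]:
        if name in _PROVIDERS:
            raise ValueError(f"provider {name!r} is already registered")
        _PROVIDERS[name] = func
        return func
    return decorator


def fetch(name: str, tickers: List[str], **kwargs) -> list:
    """Look up a registered provider and call it with the given arguments."""
    try:
        provider = _PROVIDERS[name]
    except KeyError:
        raise KeyError(f"unknown provider {name!r}; known: {sorted(_PROVIDERS)}") from None
    return provider(tickers, **kwargs)


def available_providers() -> List[str]:
    """Names of all registered providers, sorted alphabetically."""
    return sorted(_PROVIDERS)


@register("in_memory")
def fetch_in_memory(tickers, start=None, end=None, **kwargs):
    # Illustrative output only: one (ticker, start) tuple per requested ticker.
    return [(ticker, start) for ticker in tickers]


result = fetch("in_memory", ["AAPL"], start="2023-01-01")
```

Returning the function unchanged from the decorator keeps it callable directly, which makes a custom provider easy to unit test without going through the registry.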

Reference

python
hz.data.available_providers()
# ['alpaca', 'alpaca_options', 'polygon', 'yahoo']

Provider                         Asset classes                     Credentials                         Install
-------------------------------  --------------------------------  ----------------------------------  --------------------
Yahoo                            Equities, ETFs, crypto            None (free)                         pip install yfinance
Polygon                          Equities, options, crypto, forex  POLYGON_API_KEY                     pip install requests
Alpaca stocks                    Equities                          ALPACA_API_KEY + ALPACA_SECRET_KEY  pip install requests
Alpaca options                   Options snapshot + historical     same as above                       same
Alpaca crypto                    Crypto pairs                      same as above                       same
CSV / DataFrame / Dict / Custom  Any                               None                                Built-in
