Retention and destruction

Policy-driven archival and destruction with cryptographic receipts, aimed at SEC Rule 204-2 and Rule 17a-4 retention requirements.

SEC Rule 204-2 requires registered investment advisers (RIAs) to retain trading records for at least 5 years; Rule 17a-4 requires broker-dealers to keep them for 6 years with WORM storage. Once the required period has passed, compliance teams usually want records destroyed on a schedule, with a provable receipt for each run.

horizon.audit.retention is the policy engine. It ingests a RetentionPolicy, finds events older than the cutoff, excludes those under a LegalHold, archives the rest to a cold sink, destroys them from the live sink, and appends a signed receipt to the destruction log.
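The ordering of those steps can be sketched in plain Python. This is only an illustration of the flow, not the library's implementation: the `Event` dataclass, the list-based sinks, and the `enforce` helper are hypothetical stand-ins, and the receipt here is an unsigned dict with far fewer fields than a real one.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    # Hypothetical stand-in for an audit event.
    sequence: int
    timestamp: datetime
    account_id: str


def enforce(live, archive, receipts, *, cutoff, held_accounts, operator):
    """Archive, then destroy, events older than `cutoff` that are not held."""
    # Assumption: "eligible" means older than the cutoff, before holds apply.
    eligible = [e for e in live if e.timestamp < cutoff]
    held = [e for e in eligible if e.account_id in held_accounts]
    to_destroy = [e for e in eligible if e.account_id not in held_accounts]

    # Archive first so nothing is removed before a copy exists.
    archive.extend(to_destroy)
    doomed = {e.sequence for e in to_destroy}
    live[:] = [e for e in live if e.sequence not in doomed]

    # Append a (simplified, unsigned) receipt for the run.
    receipts.append({"operator": operator, "count": len(to_destroy)})
    return len(eligible), len(held), len(to_destroy)


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
cutoff = now - timedelta(days=5 * 365)
live = [
    Event(1, datetime(2019, 1, 1, tzinfo=timezone.utc), "acc_bob"),
    Event(2, datetime(2019, 2, 1, tzinfo=timezone.utc), "acc_jane"),
    Event(3, datetime(2025, 1, 1, tzinfo=timezone.utc), "acc_bob"),
]
archive, receipts = [], []
counts = enforce(live, archive, receipts, cutoff=cutoff,
                 held_accounts={"acc_jane"}, operator="ops@firm.com")
```

After the run, event 1 sits in the archive, event 2 stays live because it is held, and event 3 stays live because it is newer than the cutoff.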

Read-only queries on the live sink are unaffected by these modules and need no knowledge of them. The destruction path is an explicit, operator-invoked action.

Import

python
from horizon.audit import (
    DestructionLog,
    DestructionReceipt,
    LegalHold,
    RetentionEnforcer,
    RetentionPolicy,
)

Policy

python
policy = RetentionPolicy(
    retention_years=5,            # or retention_days=
    legal_holds=(
        LegalHold(
            reason="SEC subpoena 2026-03-14",
            account_id="acc_jane",
        ),
        LegalHold(
            reason="preserve_all_orders_pending_audit",
            category="order.filled",
        ),
    ),
)

LegalHold fields are filters. An event is held when at least one filter matches. Holds stack: an event matching any hold is excluded from destruction.

Supported filters:

  • account_id
  • client_id
  • market_id
  • category (the audit category value, e.g. "order.filled")
  • event_id

An empty LegalHold(reason=...) matches nothing. This is deliberate: no accidental “hold everything” when a field is forgotten.
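The matching rules above can be illustrated with a small stand-alone sketch. The `Hold` dataclass and the `hold_matches` / `is_held` helpers are hypothetical stand-ins, not the library's code, and events are modeled as plain dicts. It assumes, following "at least one filter matches", that the filters set on a single hold are combined with OR.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class Hold:
    # Hypothetical stand-in for LegalHold; not the library class.
    reason: str
    account_id: Optional[str] = None
    client_id: Optional[str] = None
    market_id: Optional[str] = None
    category: Optional[str] = None
    event_id: Optional[str] = None


def hold_matches(hold: Hold, event: dict) -> bool:
    """True when at least one filter set on the hold equals the event's value."""
    filters = {
        f.name: getattr(hold, f.name)
        for f in fields(hold)
        if f.name != "reason" and getattr(hold, f.name) is not None
    }
    # any() over zero filters is False, so an empty hold matches nothing.
    return any(event.get(name) == value for name, value in filters.items())


def is_held(holds, event: dict) -> bool:
    """Holds stack: matching any single hold is enough."""
    return any(hold_matches(h, event) for h in holds)


holds = (
    Hold(reason="SEC subpoena 2026-03-14", account_id="acc_jane"),
    Hold(reason="preserve_all_orders_pending_audit", category="order.filled"),
    Hold(reason="forgot to set a filter"),
)
```

With these holds, an `order.placed` event for `acc_jane` is held by the first hold, an `order.filled` event for any account is held by the second, and the third hold never matches anything.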

Enforcer

python
from horizon.audit import SQLiteSink, DestructionLog, RetentionEnforcer

source = SQLiteSink("/var/lib/horizon/audit.db")
archive = SQLiteSink("/var/lib/horizon/audit_archive.db")
destruction_log = DestructionLog("/var/lib/horizon/destruction.jsonl")

enforcer = RetentionEnforcer(policy)
result = enforcer.archive_and_destroy(
    source=source,
    archive=archive,
    destruction_log=destruction_log,
    operator="ops@firm.com",
    reason="annual_retention_enforcement_2026",
    dry_run=False,
)

The result exposes eligible_count, held_count, archived_count, and destroyed_count, so the operator can review a run both before (with dry_run=True) and after it writes. result.held_reasons is a dict mapping each hold's reason string to the count of matching events.
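One way to review a run is to turn those counters into a short report and apply a sanity check before signing off. The sketch below uses a hypothetical `RunResult` stand-in that carries only the field names listed above; `summarize` and `looks_consistent` are illustrative helpers, and the invariant they check is an assumption rather than documented behavior.

```python
from dataclasses import dataclass, field


@dataclass
class RunResult:
    # Hypothetical stand-in exposing the field names described above.
    eligible_count: int
    held_count: int
    archived_count: int
    destroyed_count: int
    held_reasons: dict = field(default_factory=dict)


def summarize(result) -> str:
    """Render the counters and per-hold counts as a short text report."""
    lines = [
        f"eligible={result.eligible_count} held={result.held_count} "
        f"archived={result.archived_count} destroyed={result.destroyed_count}"
    ]
    for reason, count in sorted(result.held_reasons.items()):
        lines.append(f"  hold {reason!r}: {count} event(s)")
    return "\n".join(lines)


def looks_consistent(result) -> bool:
    # Assumed invariant: nothing is destroyed without being archived first.
    return result.destroyed_count <= result.archived_count


result = RunResult(120, 20, 100, 100, {"SEC subpoena 2026-03-14": 20})
report = summarize(result)
```

The same helpers work on any object with those attributes, so they could be pointed at the value returned by archive_and_destroy.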

CLI

horizon audit enforce-retention \
    --db /var/lib/horizon/audit.db \
    --archive /var/lib/horizon/audit_archive.db \
    --destruction-log /var/lib/horizon/destruction.jsonl \
    --years 5 --operator ops@firm --reason annual_retention_2026 \
    [--dry-run]

Use --dry-run first. It reports what would happen without writing.

Receipts

Every destruction writes a DestructionReceipt to the destruction log. A receipt is a JSON line with:

Field                          Meaning
destroyed_at                   ISO timestamp
operator                       who ran the enforcement
reason                         free-form string
count                          number of rows removed
first_sequence, last_sequence  affected sequence range
range_hash                     SHA-256 of the concatenated hashes of destroyed events
cutoff                         policy cutoff at the time of destruction
policy                         retention_years, retention_days, n_legal_holds

The range_hash is what proves what was destroyed, even after the rows are gone. A later dispute over whether a specific historical event existed at the time of destruction can be resolved by recomputing the hash chain up to last_sequence (for example from the archived copies) and matching the result against range_hash.
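A minimal sketch of that check, using only the standard library. The exact byte layout that feeds range_hash is not specified here, so this assumes the per-event hashes are hex strings joined in sequence order with no separator; `range_hash` and `matches_receipt` are hypothetical helper names, and the event hashes are made up for the demo.

```python
import hashlib


def range_hash(event_hashes):
    """SHA-256 over the event hashes concatenated in sequence order."""
    joined = "".join(event_hashes)  # assumption: hex strings, no separator
    return hashlib.sha256(joined.encode("ascii")).hexdigest()


def matches_receipt(receipt, archived_hashes):
    """Compare hashes recovered from the archive against a receipt."""
    return (
        len(archived_hashes) == receipt["count"]
        and range_hash(archived_hashes) == receipt["range_hash"]
    )


# Made-up event hashes standing in for sequences 7..9.
hashes = [hashlib.sha256(f"event-{i}".encode()).hexdigest() for i in (7, 8, 9)]
receipt = {
    "count": 3,
    "first_sequence": 7,
    "last_sequence": 9,
    "range_hash": range_hash(hashes),
}
```

Dropping, adding, or reordering any event hash changes the digest, which is why the receipt can stand in for the destroyed rows.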

Read the destruction log

python
from horizon.audit import DestructionLog

dl = DestructionLog("/var/lib/horizon/destruction.jsonl")
for entry in dl.read_all():
    print(
        f"{entry['destroyed_at']}  operator={entry['operator']}  "
        f"count={entry['count']}  seq={entry['first_sequence']}..{entry['last_sequence']}"
    )

Archive sink

The archive is a separate SQLiteSink file that accepts write(event) calls. Sequences and hashes are preserved as they were in the source, so the archive is itself chain-verifiable.

For true cold storage, copy the archive file to an S3 bucket with Object Lock after each run. Treat the local archive sink as a convenient staging area rather than the final resting place.
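The upload step might look like the sketch below. It is not part of horizon: `object_lock_put_kwargs` and `upload_archive` are hypothetical helpers, and the bucket, key, and retention length are placeholders. It assumes the bucket was created with Object Lock enabled and that `s3_client` is a `boto3.client("s3")`; the keyword names are boto3 `put_object` parameters.

```python
from datetime import datetime, timedelta, timezone


def object_lock_put_kwargs(bucket, key, body, *, retain_days, now=None):
    """Build keyword arguments for an S3 put_object call with Object Lock."""
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # COMPLIANCE mode cannot be shortened or removed once set.
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
        # Uploads with a retention period need a checksum or Content-MD5 header.
        "ChecksumAlgorithm": "SHA256",
    }


def upload_archive(s3_client, path, bucket, key, *, retain_days):
    # Reads the whole file into memory; a large archive would want a
    # multipart upload instead.
    with open(path, "rb") as fh:
        body = fh.read()
    return s3_client.put_object(
        **object_lock_put_kwargs(bucket, key, body, retain_days=retain_days)
    )
```

Keeping the keyword construction separate from the upload makes the retention date easy to check without touching the network.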

Legal holds over time

Holds are policy data, not archive data. Compose them in code, a config file, or a compliance database. A hold does not block read access: audit replays and reports still include held events; only destruction skips them.

When a hold lifts, remove it from the policy and the next enforcement run picks those events up. The destruction log records them in that run's receipt; put the lift date in the run's reason field so it stays on record.
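As a sketch of composing holds from a config file, the snippet below filters out holds that have already lifted. The JSON layout and the `lifted_on` key are hypothetical conventions for illustration, not something the library defines; the remaining keys are the LegalHold fields used earlier.

```python
import json
from datetime import date

# Hypothetical config format: LegalHold fields plus an optional lift date.
CONFIG = """
[
  {"reason": "SEC subpoena 2026-03-14", "account_id": "acc_jane",
   "lifted_on": null},
  {"reason": "preserve_all_orders_pending_audit", "category": "order.filled",
   "lifted_on": "2026-05-01"}
]
"""


def active_hold_kwargs(config_text, today):
    """Return LegalHold keyword arguments for holds not yet lifted on `today`."""
    active = []
    for entry in json.loads(config_text):
        entry = dict(entry)
        lifted_on = entry.pop("lifted_on", None)
        if lifted_on is None or date.fromisoformat(lifted_on) > today:
            active.append(entry)
    return active


before = active_hold_kwargs(CONFIG, date(2026, 4, 1))
after = active_hold_kwargs(CONFIG, date(2026, 6, 1))
# With the library installed, something like:
#   legal_holds=tuple(LegalHold(**kw) for kw in after)
```

Run before the lift date, both holds are active; run after it, only the subpoena hold remains and the order.filled events become eligible for the next enforcement run.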

What this does not do

  • Encrypt archive contents. Do that outside the sink: at the filesystem or object-store level (for example S3 server-side encryption with KMS), with per-row envelope encryption, or with GPG before upload.
  • Compress the archive. The sink is row storage; if space is a concern, compress the archive file before uploading it to cold storage.
  • Track who read what. Audit-log reads are out of scope. If read-side auditing matters, wrap SQLiteSink.read_range in an access-logging proxy.
  • Auto-destroy. By design: destruction is always explicit. The CLI prints a summary before writing, and --dry-run is supported.
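For the read-side auditing mentioned in the list above, an access-logging proxy could be as small as the sketch below. `AccessLoggingSink` and `FakeSink` are hypothetical names; the real read_range signature is not shown in this page, so the proxy passes arguments through untouched instead of assuming one.

```python
import logging

logger = logging.getLogger("audit.reads")


class AccessLoggingSink:
    """Wraps a sink and logs every read_range call before delegating."""

    def __init__(self, sink, reader):
        self._sink = sink
        self._reader = reader

    def read_range(self, *args, **kwargs):
        # Arguments are forwarded as-is; only the access is recorded.
        logger.info(
            "read_range by %s args=%r kwargs=%r", self._reader, args, kwargs
        )
        return self._sink.read_range(*args, **kwargs)

    def __getattr__(self, name):
        # Every other attribute is forwarded to the wrapped sink unchanged.
        return getattr(self._sink, name)


class FakeSink:
    # Stand-in for SQLiteSink so the sketch runs on its own.
    def read_range(self, start, end):
        return list(range(start, end + 1))


proxied = AccessLoggingSink(FakeSink(), reader="analyst@firm.com")
rows = proxied.read_range(1, 3)
```

Where the log records go is up to the logging configuration; sending them to a separate append-only file keeps read auditing apart from the audit trail itself.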

Related