Retention and destruction
Policy-driven archival + destruction with cryptographic receipts. Rule 204-2 / 17a-4.
SEC Rule 204-2 requires RIAs to retain trading records for at least 5 years; Rule 17a-4 requires broker-dealers to retain them for 6 years in WORM storage. Once the required period has passed, compliance usually wants to destroy records on a schedule, with a provable receipt.
horizon.audit.retention is the policy engine. It ingests a RetentionPolicy, finds events older than the cutoff, excludes those under a LegalHold, archives the rest to a cold sink, destroys them from the live sink, and appends a signed receipt to the destruction log.
Read-only queries on the live sink are unaffected by this module and never invoke it. The destruction path is an explicit, operator-invoked action.
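The flow described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not horizon's implementation: `Event` and `enforce` are hypothetical names, plain lists stand in for the live sink, the cold sink, and the destruction log, and holds are reduced to a set of account ids.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:  # hypothetical stand-in for an audit event
    sequence: int
    timestamp: datetime
    account_id: str


def enforce(live, archive, log, *, now, retention_days, held_accounts, operator):
    """Archive, then destroy, expired events that are not held; append one receipt."""
    cutoff = now - timedelta(days=retention_days)
    expired = [e for e in live if e.timestamp < cutoff]
    held = [e for e in expired if e.account_id in held_accounts]
    doomed = [e for e in expired if e.account_id not in held_accounts]

    archive.extend(doomed)  # copy to the cold sink before anything is removed
    doomed_seqs = {e.sequence for e in doomed}
    live[:] = [e for e in live if e.sequence not in doomed_seqs]  # destroy in place
    if doomed:
        log.append({"operator": operator, "count": len(doomed),
                    "cutoff": cutoff.isoformat()})  # simplified receipt
    return {"expired": len(expired), "held": len(held), "destroyed": len(doomed)}


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
live = [
    Event(1, datetime(2019, 1, 1, tzinfo=timezone.utc), "acc_jane"),
    Event(2, datetime(2019, 1, 2, tzinfo=timezone.utc), "acc_bob"),
    Event(3, datetime(2025, 1, 1, tzinfo=timezone.utc), "acc_bob"),
]
archive, log = [], []
summary = enforce(live, archive, log, now=now, retention_days=5 * 365,
                  held_accounts={"acc_jane"}, operator="ops@firm.com")
```

The ordering is the point of the sketch: the archive copy happens before the live rows are removed, and the receipt is written only when something was actually destroyed.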
Import
from horizon.audit import (
    DestructionLog,
    DestructionReceipt,
    LegalHold,
    RetentionEnforcer,
    RetentionPolicy,
)
Policy
policy = RetentionPolicy(
    retention_years=5,  # or retention_days=
    legal_holds=(
        LegalHold(
            reason="SEC subpoena 2026-03-14",
            account_id="acc_jane",
        ),
        LegalHold(
            reason="preserve_all_orders_pending_audit",
            category="order.filled",
        ),
    ),
)
LegalHold fields are filters. An event is held when at least one filter matches. Holds stack: an event matching any hold is excluded from destruction.
Supported filters:
- account_id
- client_id
- market_id
- category (the audit category value, e.g. "order.filled")
- event_id
An empty LegalHold(reason=...) matches nothing. This is deliberate: no accidental “hold everything” when a field is forgotten.
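The matching rules above can be illustrated with a small stand-in class. `HoldSketch` and `is_held` are hypothetical names, events are plain dicts, and the sketch assumes, following the wording above, that a hold matches when any one of its set fields equals the event's value; the real LegalHold may combine fields differently.

```python
from dataclasses import dataclass
from typing import Optional

FILTER_FIELDS = ("account_id", "client_id", "market_id", "category", "event_id")


@dataclass(frozen=True)
class HoldSketch:  # hypothetical stand-in, not horizon's LegalHold
    reason: str
    account_id: Optional[str] = None
    client_id: Optional[str] = None
    market_id: Optional[str] = None
    category: Optional[str] = None
    event_id: Optional[str] = None

    def matches(self, event: dict) -> bool:
        # Only fields that were set act as filters. any() over all-False
        # values is False, so a hold with no filters matches nothing.
        return any(
            getattr(self, name) is not None and event.get(name) == getattr(self, name)
            for name in FILTER_FIELDS
        )


def is_held(event: dict, holds) -> bool:
    # Holds stack: a single matching hold is enough to protect the event.
    return any(hold.matches(event) for hold in holds)


holds = (
    HoldSketch(reason="SEC subpoena 2026-03-14", account_id="acc_jane"),
    HoldSketch(reason="filter forgotten"),  # empty hold, protects nothing
)
jane_fill = {"event_id": "evt_1", "account_id": "acc_jane", "category": "order.filled"}
bob_fill = {"event_id": "evt_2", "account_id": "acc_bob", "category": "order.filled"}
```

With these holds, `is_held(jane_fill, holds)` is true and `is_held(bob_fill, holds)` is false, because the empty hold contributes no filter.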
Enforcer
from horizon.audit import SQLiteSink, DestructionLog, RetentionEnforcer

source = SQLiteSink("/var/lib/horizon/audit.db")
archive = SQLiteSink("/var/lib/horizon/audit_archive.db")
destruction_log = DestructionLog("/var/lib/horizon/destruction.jsonl")

enforcer = RetentionEnforcer(policy)
result = enforcer.archive_and_destroy(
    source=source,
    archive=archive,
    destruction_log=destruction_log,
    operator="ops@firm.com",
    reason="annual_retention_enforcement_2026",
    dry_run=False,
)
The result exposes eligible_count, held_count, archived_count, and destroyed_count so the operator can review a run before and after it executes. held_reasons is a dict mapping each hold’s reason string to the number of events it matched.
CLI
horizon audit enforce-retention \
    --db /var/lib/horizon/audit.db \
    --archive /var/lib/horizon/audit_archive.db \
    --destruction-log /var/lib/horizon/destruction.jsonl \
    --years 5 --operator ops@firm --reason annual_retention_2026 \
    [--dry-run]
Use --dry-run first. It reports what would happen without writing.
Receipts
Every destruction writes a DestructionReceipt to the destruction log. A receipt is a JSON line with:
| Field | Meaning |
|---|---|
| destroyed_at | ISO timestamp |
| operator | who ran the enforcement |
| reason | free-form string |
| count | number of rows removed |
| first_sequence, last_sequence | affected sequence range |
| range_hash | SHA-256 of the concatenated hashes of destroyed events |
| cutoff | policy cutoff at the time of destruction |
| policy | retention_years, retention_days, n_legal_holds |
The range_hash is what proves what was destroyed, even after the rows are gone. Any later dispute over whether a specific historical event existed at the time of destruction can be resolved by recomputing the chain up to last_sequence and matching against range_hash.
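A check of this kind might look like the sketch below. It is an assumption-laden illustration, not horizon's verifier: it assumes each event hash is a hex string, that the hashes are concatenated in sequence order with no separator, and that archived rows are available as dicts with hypothetical `sequence` and `hash` keys.

```python
import hashlib


def range_hash(event_hashes):
    """SHA-256 over the given event hashes, concatenated in the order supplied."""
    digest = hashlib.sha256()
    for h in event_hashes:
        digest.update(h.encode("ascii"))  # assumption: hashes are hex strings
    return digest.hexdigest()


def receipt_matches(receipt, archived_rows):
    """Compare archived rows against a receipt's sequence range, count, and range_hash."""
    rows = sorted(
        (r for r in archived_rows
         if receipt["first_sequence"] <= r["sequence"] <= receipt["last_sequence"]),
        key=lambda r: r["sequence"],
    )
    return (len(rows) == receipt["count"]
            and range_hash(r["hash"] for r in rows) == receipt["range_hash"])


# Hypothetical archived rows and the receipt that would cover them.
rows = [{"sequence": n, "hash": hashlib.sha256(str(n).encode()).hexdigest()}
        for n in (10, 11, 12)]
receipt = {"first_sequence": 10, "last_sequence": 12, "count": 3,
           "range_hash": range_hash(r["hash"] for r in rows)}
```

If any archived row in the range is altered, dropped, or added, either the count or the recomputed hash stops matching the receipt.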
Read the destruction log
from horizon.audit import DestructionLog
dl = DestructionLog("/var/lib/horizon/destruction.jsonl")
for entry in dl.read_all():
    print(
        f"{entry['destroyed_at']} operator={entry['operator']} "
        f"count={entry['count']} seq={entry['first_sequence']}..{entry['last_sequence']}"
    )
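Because each receipt is a JSON line, the log can also be inspected with the standard library alone, for example from a reporting job that does not import horizon. The helper names below are hypothetical; only the `operator` and `count` fields from the table above are used.

```python
import json
from pathlib import Path


def iter_receipts(path):
    """Yield one dict per non-empty line of a JSON-lines destruction log."""
    with Path(path).open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:
                yield json.loads(line)


def destroyed_by_operator(path):
    """Sum the `count` field of every receipt, grouped by operator."""
    totals = {}
    for receipt in iter_receipts(path):
        operator = receipt["operator"]
        totals[operator] = totals.get(operator, 0) + receipt["count"]
    return totals
```

Calling `destroyed_by_operator("/var/lib/horizon/destruction.jsonl")` would return a dict keyed by operator, which is a quick way to reconcile receipts against an enforcement schedule.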
Archive sink
The archive is a separate SQLiteSink file that accepts write(event) calls. Sequences and hashes are preserved as they were in the source, so the archive is itself chain-verifiable.
For true cold storage, copy the archive file to an S3 bucket with Object Lock after each run. The local archive sink then serves as a convenient staging area.
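One way to do the copy is with boto3, which is not part of horizon. The sketch below assumes the bucket was created with Object Lock enabled; the function names, bucket, key, and retention length are all placeholders. The argument builder is kept separate so it can be checked without network access.

```python
from datetime import datetime, timedelta, timezone


def object_lock_put_args(bucket, key, body, *, retain_days, now=None):
    """Build keyword arguments for an S3 PutObject call with Object Lock retention."""
    now = now or datetime.now(timezone.utc)
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # COMPLIANCE mode: retention cannot be shortened or removed before the date.
        "ObjectLockMode": "COMPLIANCE",
        "ObjectLockRetainUntilDate": now + timedelta(days=retain_days),
        # Uploads that set Object Lock parameters need a checksum or Content-MD5.
        "ChecksumAlgorithm": "SHA256",
    }


def upload_archive(path, bucket, key, *, retain_days):
    """Upload the archive file; requires boto3 and AWS credentials."""
    import boto3  # third-party, imported lazily so the builder has no dependency

    with open(path, "rb") as fh:
        args = object_lock_put_args(bucket, key, fh, retain_days=retain_days)
        boto3.client("s3").put_object(**args)
```

A call such as `upload_archive("/var/lib/horizon/audit_archive.db", "example-bucket", "audit/2026.db", retain_days=365)` would then run after each enforcement; the bucket name and key here are invented for illustration.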
Legal holds over time
Holds are policy data, not archive data. Compose them in code, a config file, or a compliance database. A hold does not block read access: audit replays and reports still include held events; only destruction skips them.
When a hold lifts, the next enforcement run picks those events up. The destruction log records them in that run’s receipt; put the lift date in the run’s reason field so the receipt documents it.
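Keeping holds in a config file could look like the sketch below. The JSON layout and the `lifted_on` field are assumptions made for this example, not a format horizon defines; the helper only filters out lifted holds and returns keyword arguments.

```python
import json
from datetime import date


def active_hold_kwargs(config_text, today):
    """Return keyword-argument dicts for holds that have not been lifted yet."""
    active = []
    for entry in json.loads(config_text):
        lifted_on = entry.get("lifted_on")  # hypothetical field: ISO date or absent
        if lifted_on is not None and date.fromisoformat(lifted_on) <= today:
            continue  # lifted: its events become eligible on the next run
        active.append({k: v for k, v in entry.items() if k != "lifted_on"})
    return active


config = """
[
  {"reason": "SEC subpoena 2026-03-14", "account_id": "acc_jane"},
  {"reason": "internal review", "category": "order.filled", "lifted_on": "2026-05-01"}
]
"""
kwargs_list = active_hold_kwargs(config, today=date(2026, 6, 1))
```

Each remaining dict could then be expanded into a hold, for example `tuple(LegalHold(**kw) for kw in kwargs_list)`, and passed as `legal_holds` when building the policy.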
What this does not do
- Encrypt archive contents. Do that at the filesystem or object-store level (S3 server-side encryption with KMS, per-row envelope encryption, or GPG before upload).
- Compress archive. The sink is row-storage; if storage is a concern, compress the archive file before uploading to cold storage.
- Track who read what. Audit-log reads are out of scope. If read-side auditing matters, wrap SQLiteSink.read_range in an access-logging proxy.
- Auto-destroy. By design: destruction is always explicit. The CLI prints a summary before writing, and --dry-run is supported.
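For the read-side auditing point above, an access-logging proxy might be as small as the sketch below. `AccessLoggingSink` and the `audit.access` logger name are hypothetical, and because the signature of read_range is not shown here, arguments are passed through untouched.

```python
import logging
from datetime import datetime, timezone

access_log = logging.getLogger("audit.access")


class AccessLoggingSink:
    """Wrap a sink and record who called read_range, and with what arguments."""

    def __init__(self, sink, reader):
        self._sink = sink
        self._reader = reader

    def read_range(self, *args, **kwargs):
        # Log first, then delegate, so failed reads are recorded as well.
        access_log.info(
            "read_range reader=%s at=%s args=%r kwargs=%r",
            self._reader, datetime.now(timezone.utc).isoformat(), args, kwargs,
        )
        return self._sink.read_range(*args, **kwargs)

    def __getattr__(self, name):
        # Everything other than read_range goes straight to the wrapped sink.
        return getattr(self._sink, name)
```

Wrapping is then a one-liner such as `AccessLoggingSink(SQLiteSink("/var/lib/horizon/audit.db"), reader="alice")`; where the access log itself is stored and protected is left to the deployment.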
Related
- Audit trail for the event model.
- Audit CLI for all subcommands.