
Data Persistence

How trades, orders, and klines are persisted — RocksDB event sourcing for crash recovery, Postgres read store for queryable history.

Two-Tier Persistence

Olympus uses two persistence layers, each serving a different purpose:

| Layer | Technology | Purpose | Latency |
|---|---|---|---|
| Event log | RocksDB (embedded) | Crash recovery via deterministic replay | Microseconds |
| Read store | Postgres (external) | Queryable trade/order/kline history | Milliseconds |

The matching engine's hot path writes only to RocksDB. The Postgres read store is populated asynchronously by a background writer task and is not on the critical path.

Data Flow

```
Engine thread
  |
  +-- broadcast<TickMarketData>              (public feeds, no PII)
  |     +-- MarketDataPublisher -> WebSocket  (strips identifying data)
  |     +-- KlineAggregator -> candles
  |           +-- broadcast<Arc<str>> -> WebSocket (in-progress + closed)
  |           +-- broadcast<Kline> -> PersistenceWriter (closed only)
  |
  +-- broadcast<TickPersistenceData>         (internal, full identifying data)
        +-- PersistenceWriter
              +-- trades    -> INSERT INTO trades
              +-- orders    -> INSERT ... ON CONFLICT DO UPDATE orders
              +-- klines    -> INSERT ... ON CONFLICT DO UPDATE klines
```

Data privacy boundary (FCA)

The external WebSocket feeds and the internal persistence store receive data through separate broadcast channels with different payload types. The boundary is enforced at compile time: the market data publisher is never handed a TickPersistenceData receiver, so code that tried to read identifying fields would not type-check.

| Channel | Contains PII | Consumers |
|---|---|---|
| TickMarketData | No — trades stripped to {symbol, price, qty, time} | MarketDataPublisher, KlineAggregator |
| TickPersistenceData | Yes — full buyer_account, seller_account, order IDs | PersistenceWriter only |
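
The separation can be sketched in a few lines. Struct fields beyond those named above, and all types, are illustrative assumptions; only the channel and consumer names come from the design:

```rust
use tokio::sync::broadcast;

// Public tick: stripped to {symbol, price, qty, time}. Field types
// are assumptions for this sketch.
#[derive(Clone)]
pub struct TickMarketData {
    pub symbol: String,
    pub price: u64,
    pub qty: u64,
    pub time: u64,
}

// Internal tick: carries full identifying data.
#[derive(Clone)]
pub struct TickPersistenceData {
    pub symbol: String,
    pub price: u64,
    pub qty: u64,
    pub time: u64,
    pub buyer_account: String,
    pub seller_account: String,
    pub buyer_order_id: u64,  // UUID in the real schema
    pub seller_order_id: u64, // UUID in the real schema
}

// The publisher's signature only admits the stripped channel; a body
// that tried to touch buyer_account would not type-check.
pub async fn market_data_publisher(mut rx: broadcast::Receiver<TickMarketData>) {
    while let Ok(tick) = rx.recv().await {
        // forward {symbol, price, qty, time} to WebSocket subscribers ...
        let _ = tick;
    }
}

// Only the persistence writer is handed the PII-bearing receiver.
pub async fn persistence_writer(mut rx: broadcast::Receiver<TickPersistenceData>) {
    while let Ok(tick) = rx.recv().await {
        // buffer and batch-insert into Postgres ...
        let _ = tick;
    }
}
```

Because market_data_publisher receives only a broadcast::Receiver<TickMarketData>, there is no code path by which it could observe account identifiers.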

The Postgres store is only accessible through authenticated API endpoints scoped to the requesting user's own account.

RocksDB Event Log

The primary persistence mechanism is an append-only transaction log in RocksDB (a write-path sketch follows the list):

  • Every processed tick is serialized and appended to CF_SEQUENCER_LOG
  • Key: u64 sequence number (big-endian for lexicographic ordering)
  • Value: serialized Tick containing all transactions
  • Engine snapshots saved to CF_SNAPSHOTS every 1000 ticks
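
A minimal sketch of the append path, assuming the rocksdb crate; the column family's string value is an assumption (the design only fixes the CF_SEQUENCER_LOG name):

```rust
use rocksdb::DB;

// Column family name string is an assumption for this sketch.
const CF_SEQUENCER_LOG: &str = "sequencer_log";

// Append one serialized tick under its big-endian sequence number.
// Big-endian makes numeric order and lexicographic byte order agree,
// so forward iteration visits ticks in sequence order.
fn append_tick(db: &DB, sequence: u64, tick_bytes: &[u8]) -> Result<(), rocksdb::Error> {
    let cf = db.cf_handle(CF_SEQUENCER_LOG).expect("CF opened at startup");
    db.put_cf(cf, sequence.to_be_bytes(), tick_bytes)
}
```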

On crash recovery, the engine loads the latest snapshot and replays every tick logged after it. See Architecture — Crash Recovery.
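
A matching replay sketch under the same assumptions (recent rocksdb crate versions yield Result items from iterators):

```rust
use rocksdb::{Direction, IteratorMode, DB};

// Replay every tick logged after the snapshot's sequence number.
// `apply` stands in for the engine's deterministic tick application.
fn replay_after(
    db: &DB,
    snapshot_seq: u64,
    mut apply: impl FnMut(u64, &[u8]),
) -> Result<(), rocksdb::Error> {
    let cf = db.cf_handle("sequencer_log").expect("CF opened at startup");
    let start = (snapshot_seq + 1).to_be_bytes();
    for item in db.iterator_cf(cf, IteratorMode::From(&start, Direction::Forward)) {
        let (key, value) = item?;
        let seq = u64::from_be_bytes(key[..8].try_into().expect("8-byte key"));
        apply(seq, &value);
    }
    Ok(())
}
```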

Postgres Read Store

Schema

Three tables optimized for the query patterns of the Binance-compatible API:

trades — Every executed trade with full identifying data:

```
id, sequence, instrument_id, price, quantity,
buyer_account, seller_account, buyer_order_id, seller_order_id,
timestamp_ms, created_at
```

Indexed on buyer_account, seller_account, instrument_id (all with descending timestamp), and sequence.
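
Hypothetical DDL consistent with the columns and indexes described; the real migration may differ in types and index names:

```rust
const CREATE_TRADES: &str = r#"
CREATE TABLE IF NOT EXISTS trades (
    id              BIGSERIAL PRIMARY KEY,
    sequence        BIGINT NOT NULL,
    instrument_id   BIGINT NOT NULL,
    price           NUMERIC NOT NULL,
    quantity        NUMERIC NOT NULL,
    buyer_account   TEXT NOT NULL,
    seller_account  TEXT NOT NULL,
    buyer_order_id  UUID NOT NULL,
    seller_order_id UUID NOT NULL,
    timestamp_ms    BIGINT NOT NULL,
    created_at      TIMESTAMPTZ NOT NULL DEFAULT now()
);
CREATE INDEX IF NOT EXISTS trades_buyer_ts  ON trades (buyer_account, timestamp_ms DESC);
CREATE INDEX IF NOT EXISTS trades_seller_ts ON trades (seller_account, timestamp_ms DESC);
CREATE INDEX IF NOT EXISTS trades_instr_ts  ON trades (instrument_id, timestamp_ms DESC);
CREATE INDEX IF NOT EXISTS trades_sequence  ON trades (sequence);
"#;
```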

orders — Order lifecycle tracking:

```
order_id (UUID PK), account_id, instrument_id, side, order_type,
price, quantity, status, filled_qty, sequence, timestamp_ms, updated_at
```

Status updated via ON CONFLICT (order_id) DO UPDATE on each fill/cancel event. Indexed on account_id, instrument_id, and (account_id, status).
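
The upsert the writer issues could look like the following; the exact column set updated on conflict is an assumption:

```rust
// Illustrative upsert per fill/cancel event.
const UPSERT_ORDER: &str = r#"
INSERT INTO orders (order_id, account_id, instrument_id, side, order_type,
                    price, quantity, status, filled_qty, sequence, timestamp_ms)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11)
ON CONFLICT (order_id) DO UPDATE
SET status       = EXCLUDED.status,
    filled_qty   = EXCLUDED.filled_qty,
    sequence     = EXCLUDED.sequence,
    timestamp_ms = EXCLUDED.timestamp_ms,
    updated_at   = now()
"#;
```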

klines — Closed OHLCV candles:

```
id, instrument_id, interval, open_time, close_time,
open, high, low, close, volume, trade_count
```

Unique index on (instrument_id, interval, open_time) for upsert support. Intervals: 1m, 5m, 15m, 1h, 4h, 1d.
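
A corresponding illustrative upsert keyed on that unique index, so a re-delivered closed candle overwrites rather than duplicates:

```rust
const UPSERT_KLINE: &str = r#"
INSERT INTO klines (instrument_id, interval, open_time, close_time,
                    open, high, low, close, volume, trade_count)
VALUES ($1, $2, $3, $4, $5, $6, $7, $8, $9, $10)
ON CONFLICT (instrument_id, interval, open_time) DO UPDATE
SET close_time  = EXCLUDED.close_time,
    open        = EXCLUDED.open,
    high        = EXCLUDED.high,
    low         = EXCLUDED.low,
    close       = EXCLUDED.close,
    volume      = EXCLUDED.volume,
    trade_count = EXCLUDED.trade_count
"#;
```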

Writer Task

The persistence writer (crates/api/src/persistence.rs) runs as a tokio task (a condensed sketch follows the list):

  • Subscribes to broadcast<TickPersistenceData> for trades and order events
  • Subscribes to broadcast<Kline> for closed candles from the kline aggregator
  • Buffers events in memory and flushes to Postgres every 100ms or 500 rows (whichever comes first)
  • Uses sqlx::QueryBuilder for multi-row batch inserts
  • All three table inserts run in a single Postgres transaction per flush
  • Handles Lagged errors by logging a warning and continuing (gaps can be backfilled from RocksDB)
  • Caps buffers at 10,000 rows to prevent unbounded memory growth
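
A condensed sketch of that loop, showing only the trades stream; TradeRow, its column types, and the use of the tracing crate for warnings are assumptions:

```rust
use std::time::Duration;

use sqlx::{PgPool, Postgres, QueryBuilder};
use tokio::sync::broadcast::{self, error::RecvError};

const FLUSH_INTERVAL: Duration = Duration::from_millis(100);
const FLUSH_ROWS: usize = 500;
const MAX_BUFFER: usize = 10_000;

// Illustrative flattened trade row; the real writer also buffers order
// and kline events and flushes all three in the same transaction.
#[derive(Clone)]
pub struct TradeRow {
    pub sequence: i64,
    pub instrument_id: i64,
    pub price: i64,
    pub quantity: i64,
    pub buyer_account: String,
    pub seller_account: String,
    pub timestamp_ms: i64,
}

pub async fn run_writer(pool: PgPool, mut rx: broadcast::Receiver<TradeRow>) {
    let mut buf: Vec<TradeRow> = Vec::new();
    let mut ticker = tokio::time::interval(FLUSH_INTERVAL);
    loop {
        tokio::select! {
            msg = rx.recv() => match msg {
                Ok(row) => {
                    // Cap the buffer so a long Postgres outage cannot
                    // grow memory without bound.
                    if buf.len() < MAX_BUFFER {
                        buf.push(row);
                    }
                    if buf.len() >= FLUSH_ROWS {
                        flush(&pool, &mut buf).await;
                    }
                }
                // Lagging is survivable: the gap can be backfilled
                // from the RocksDB transaction log.
                Err(RecvError::Lagged(skipped)) => {
                    tracing::warn!(skipped, "persistence stream lagged");
                }
                Err(RecvError::Closed) => break,
            },
            _ = ticker.tick() => {
                if !buf.is_empty() {
                    flush(&pool, &mut buf).await;
                }
            }
        }
    }
}

async fn flush(pool: &PgPool, buf: &mut Vec<TradeRow>) {
    let Ok(mut tx) = pool.begin().await else {
        return; // keep the buffer; retry on the next flush
    };
    let mut qb: QueryBuilder<Postgres> = QueryBuilder::new(
        "INSERT INTO trades (sequence, instrument_id, price, quantity, \
         buyer_account, seller_account, timestamp_ms) ",
    );
    qb.push_values(buf.iter(), |mut b, r| {
        b.push_bind(r.sequence)
            .push_bind(r.instrument_id)
            .push_bind(r.price)
            .push_bind(r.quantity)
            .push_bind(r.buyer_account.clone())
            .push_bind(r.seller_account.clone())
            .push_bind(r.timestamp_ms);
    });
    // Orders and klines would be written inside this same transaction.
    if qb.build().execute(&mut *tx).await.is_ok() && tx.commit().await.is_ok() {
        buf.clear(); // rows are durable; on error they stay for retry
    }
}
```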

Engine Mode Compatibility

The writer is mode-agnostic. Both batch mode and continuous mode emit to the same broadcast channels:

  • Batch mode: events arrive once per tick (~1ms), bundled
  • Continuous mode: events arrive per-order as trades are produced

The writer's internal batching absorbs this difference.

Performance Impact

The engine thread gains one broadcast::send() call (~5ns ring buffer write). All Postgres I/O runs on the tokio runtime, on separate threads from the engine. With CPU pinning enabled (OLYMPUS_ENGINE_CORE), the engine core is fully isolated.
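
A hypothetical pinning helper, assuming the core_affinity crate; the actual mechanism behind OLYMPUS_ENGINE_CORE may differ:

```rust
// Pin the calling (engine) thread to the core named by the environment
// variable; tokio worker threads remain on the other cores.
fn pin_engine_thread() {
    let Ok(core) = std::env::var("OLYMPUS_ENGINE_CORE") else { return };
    let Ok(idx) = core.parse::<usize>() else { return };
    if let Some(ids) = core_affinity::get_core_ids() {
        if let Some(&id) = ids.iter().find(|c| c.id == idx) {
            core_affinity::set_for_current(id);
        }
    }
}
```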

If Postgres is slow or unavailable, the engine and WebSocket feeds are completely unaffected. The writer retries, and gaps can be backfilled from the transaction log.

API Query Endpoints

Three Binance-compatible endpoints query the Postgres read store:

| Endpoint | Auth | Description |
|---|---|---|
| GET /api/v1/myTrades?symbol=ETH-USD | AccountAuth | Trade history for the authenticated user |
| GET /api/v1/allOrders?symbol=ETH-USD | AccountAuth | Order history for the authenticated user |
| GET /api/v1/klines?symbol=ETH-USD&interval=1h | Public | OHLCV candlestick data |

All support startTime, endTime, and limit (default 500, max 1000) query parameters. All return 503 if DATABASE_URL is not configured.
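
The shape of the myTrades lookup might look like the following; the SQL is illustrative, with symbol-to-instrument resolution and the 500-row default applied in the handler before binding:

```rust
// Illustrative only: the handler resolves ?symbol= to an instrument_id
// and clamps the limit with min(limit.unwrap_or(500), 1000) before $5.
const MY_TRADES_SQL: &str = r#"
SELECT * FROM trades
WHERE (buyer_account = $1 OR seller_account = $1)
  AND instrument_id = $2
  AND ($3::BIGINT IS NULL OR timestamp_ms >= $3)
  AND ($4::BIGINT IS NULL OR timestamp_ms <= $4)
ORDER BY timestamp_ms DESC
LIMIT $5
"#;
```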

See the API Reference for full request/response schemas.

Configuration

The read store is enabled by setting the DATABASE_URL environment variable:

```
DATABASE_URL=postgres://user:pass@host:5432/olympus_chain
```

If unset, the application starts without the read store — history endpoints return 503 and no writer task is spawned. Migrations run automatically on startup via sqlx::migrate!.
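
A sketch of that startup path, assuming a ./migrations directory and an arbitrary pool size:

```rust
use sqlx::postgres::PgPoolOptions;

// Optional read-store startup: if DATABASE_URL is unset the app runs
// without Postgres (history endpoints answer 503 elsewhere).
async fn init_read_store() -> Option<sqlx::PgPool> {
    let url = std::env::var("DATABASE_URL").ok()?;
    let pool = PgPoolOptions::new()
        .max_connections(8)
        .connect(&url)
        .await
        .expect("failed to connect to Postgres");
    sqlx::migrate!("./migrations")
        .run(&pool)
        .await
        .expect("migrations failed");
    Some(pool)
}
```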
