Rust & CEX Engineering — OKX Interview Preparation

Last updated: 2026-03-29


Todo:

  • Tokio
  • Borrow checker
  • async/await (Future trait, methods)
  • Smart pointers: Box, Arc, RefCell
  • Mutex, guards, RwLock
  • Send and Sync traits — what they guarantee when moving values to another thread
  • Lifetimes
  • impl Trait vs dyn Trait

Resources

Rust Fundamentals

Practice & Challenges

  • Exercism Rust Track — 100+ exercises with mentored feedback
  • LeetCode — algorithm bank with full Rust support
  • Advent of Code — annual puzzles, huge Rust community, great for benchmarking idiomatic solutions
  • CodeCrafters — build Redis, Git, HTTP server from scratch in Rust
  • Rust Quiz — tricky edge-case questions by dtolnay (author of serde/syn)

Order Book & Matching Engine

Algorithms & Data Structures




1) Role & what OKX is looking for

Target role: Staff/Senior Rust Software Engineer at OKX

What OKX explicitly asks for

  • Deep Rust expertise — language internals, ecosystem, toolchain
  • Cross-platform FFI bindings and system integration
  • Performance tuning in high-throughput / low-latency systems
  • Preferred: prior experience in trading systems or fintech platforms
  • Blockchain / cryptographic protocol experience (bonus)
  • Open-source Rust contributions (bonus)
  • Mentorship and architectural ownership at scale

Why Rust at a CEX

Deterministic latency without GC pauses + memory safety enforced at compile time. The borrow checker eliminates data races and use-after-free — exactly the bugs that cause production incidents in financial systems.

Reported figures from one production Rust matching engine: P50 tick-to-trade ≈ 4.2μs, P99 ≈ 7.8μs — on par with legacy C++ baselines.


2) What the role is in practice

This is a systems engineering role, not application CRUD. OKX wants engineers who can:

  1. Build or extend a matching engine (ordering, fills, state)
  2. Design market data pipelines (WebSocket fan-out, sequence gaps)
  3. Write risk engine components (pre-trade checks, position counters)
  4. Reason about latency at microsecond level
  5. Own correctness across concurrent systems

Behavioral framing for senior-level questions:

“I approach exchange systems as four layers: order gateway, matching core, market data distribution, and risk. Each has different latency and correctness requirements. In Rust this maps naturally: single-threaded matching core fed by MPSC channels, async I/O at the gateway, lock-free atomics in risk counters, broadcast channels for market data fan-out.”


3) Four layers of a CEX you must speak to

| Layer | What it does | Rust approach |
| --- | --- | --- |
| Order Gateway | Receives orders (REST/WebSocket/FIX), validates, routes | Tokio async, session management, FIX protocol |
| Matching Engine | Matches buy/sell orders, produces fills | Single-threaded, MPSC input, BTreeMap order book |
| Market Data | Publishes book updates + trades to subscribers | Tokio broadcast channel, sequence numbers, snapshot + delta |
| Risk Engine | Pre-trade checks (position limits, balance), post-trade settlement | AtomicI64 counters, RwLock account state, <1μs target |

4) Rust fundamentals refresh

Ownership & Borrowing

  • Single owner at a time; borrow checker enforces at compile time
  • &T shared reference (many readers), &mut T exclusive (one writer)
  • Move semantics: assignment transfers ownership unless Copy
  • Lifetimes: compiler proves references don’t outlive data

Exchange relevance: explicit ownership means shared-state bugs are rejected at compile time. No GC = no latency spikes.

Rc vs Arc

  • Rc<T> — reference counted, single-thread only
  • Arc<T> — atomic reference counted, safe across threads (Send + Sync)
  • Arc<Mutex<T>> for shared mutable state across threads
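A minimal sketch of the Arc<Mutex<T>> pattern above: several threads concurrently credit a shared balance map. The function name and user/deposit counts are illustrative; a real exchange would shard this state or use lock-free counters.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};
use std::thread;

// Shared mutable account state: Arc gives shared ownership across
// threads, Mutex serializes mutation.
fn credit_concurrently(users: u64, deposits: u64) -> HashMap<u64, i64> {
    let balances: Arc<Mutex<HashMap<u64, i64>>> = Arc::new(Mutex::new(HashMap::new()));
    let mut handles = Vec::new();
    for user in 0..users {
        let balances = Arc::clone(&balances); // bump refcount, move clone into thread
        handles.push(thread::spawn(move || {
            for _ in 0..deposits {
                *balances.lock().unwrap().entry(user).or_insert(0) += 1;
            }
        }));
    }
    for h in handles {
        h.join().unwrap();
    }
    // All clones are dropped after join, so we can take the map back out.
    Arc::try_unwrap(balances).unwrap().into_inner().unwrap()
}

fn main() {
    let final_balances = credit_concurrently(4, 1000);
    assert!(final_balances.values().all(|&b| b == 1000));
    println!("all balances settled: {:?}", final_balances);
}
```

Note the design choice: Arc (not Rc) because the value crosses thread boundaries, so it must be Send + Sync.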

Send and Sync

  • Send: type can be moved to another thread
  • Sync: type can be shared by reference across threads
  • Tokio’s spawn requires futures to be Send (work-stealing scheduler)

Async / Tokio channels

// Market data fan-out — N subscribers each get every message
let (tx, _) = tokio::sync::broadcast::channel(1024);
 
// Order flow into matching engine — single consumer
let (order_tx, mut order_rx) = tokio::sync::mpsc::channel(256);
 
// BBO — single writer, many readers, only latest value
let (bbo_tx, bbo_rx) = tokio::sync::watch::channel(initial_bbo);
 
// Race order response vs timeout
tokio::select! {
    result = order_future => handle_result(result),
    _ = tokio::time::sleep(timeout) => handle_timeout(),
}
| Channel | Use case |
| --- | --- |
| mpsc | Order flow into matching engine (many senders, one receiver) |
| broadcast | Market data to N subscribers |
| watch | BBO / last price (single writer, many readers, only latest) |
| oneshot | Request/response pattern |

Concurrency gotchas

  • False sharing: two AtomicI64 on same 64-byte cache line → cache invalidation storms → use #[repr(align(64))]
  • Ordering correctness: Relaxed for independent counters, AcqRel for producer/consumer pairs, SeqCst for total ordering (expensive)
  • Mutex vs RwLock: RwLock for read-heavy account state; Mutex when writes are frequent
  • crossbeam::channel over std::sync::mpsc in hot paths
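The false-sharing fix from the first bullet can be sketched directly — a 64-byte-aligned wrapper gives each hot counter its own cache line. The type names are illustrative; real layouts deserve measurement with perf.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

// Without alignment, two hot counters can land on the same 64-byte
// cache line, and each write from one thread invalidates the other
// thread's cached copy. Forcing 64-byte alignment separates them.
#[repr(align(64))]
pub struct PaddedCounter(pub AtomicI64);

pub struct RiskCounters {
    pub filled: PaddedCounter,   // written by the matching thread
    pub rejected: PaddedCounter, // written by the risk thread
}

impl RiskCounters {
    pub fn new() -> Self {
        RiskCounters {
            filled: PaddedCounter(AtomicI64::new(0)),
            rejected: PaddedCounter(AtomicI64::new(0)),
        }
    }
}

fn main() {
    // Each PaddedCounter occupies (at least) a full cache line.
    assert_eq!(std::mem::align_of::<PaddedCounter>(), 64);
    assert!(std::mem::size_of::<PaddedCounter>() >= 64);
    let c = RiskCounters::new();
    c.filled.0.fetch_add(1, Ordering::Relaxed); // independent counter: Relaxed suffices
    println!("filled = {}", c.filled.0.load(Ordering::Relaxed));
}
```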

Error handling

// Domain errors — callers can match on variants
#[derive(thiserror::Error, Debug)]
pub enum OrderError {
    #[error("insufficient balance: need {need}, have {have}")]
    InsufficientBalance { need: u64, have: u64 },
    #[error("invalid price: {0}")]
    InvalidPrice(Decimal),
}
 
// Never in production hot paths — panics
some_option.unwrap()

Performance patterns

  • Arena allocation: bumpalo — allocate at startup, reset between cycles, zero per-order heap alloc
  • Avoid Box<T> in hot paths — prefer fixed-size stack structs
  • Zero-copy serde: rkyv or flatbuffers for wire format
  • Huge pages: 2MB pages cover 512× more memory per TLB entry than 4KB pages, sharply reducing TLB misses
  • Profiling: cargo-flamegraph, criterion for benchmarks, perf for CPU counters

5) Exchange domain knowledge

Order book

use std::collections::{BTreeMap, VecDeque};
 
struct OrderBook {
    bids: BTreeMap<Price, VecDeque<Order>>,  // BTreeMap sorts ascending; best bid = highest key
    asks: BTreeMap<Price, VecDeque<Order>>,  // best ask = lowest key
}
// Best bid = bids.iter().next_back()
// Best ask = asks.iter().next()
  • Insert/delete: O(log n)
  • Lock-free variant: per-price-level atomics, no global lock
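The book structure above can be exercised with a minimal matching loop — a sketch of crossing an incoming buy limit order against the ask side, best price first, FIFO within a level. The Price/Qty aliases and Order fields are illustrative assumptions.

```rust
use std::collections::{BTreeMap, VecDeque};

type Price = u64; // price in ticks; production code would use a fixed-point type
type Qty = u64;

#[allow(dead_code)]
#[derive(Debug, Clone)]
struct Order { id: u64, qty: Qty }

// Match an incoming buy against resting asks: lowest price first,
// FIFO within each price level. Returns (price, qty) fills.
fn match_buy(
    asks: &mut BTreeMap<Price, VecDeque<Order>>,
    limit: Price,
    mut qty: Qty,
) -> Vec<(Price, Qty)> {
    let mut fills = Vec::new();
    while qty > 0 {
        // Best ask = lowest key in the ask map.
        let Some((&price, _)) = asks.iter().next() else { break };
        if price > limit { break } // incoming order no longer crosses
        let level = asks.get_mut(&price).unwrap();
        while qty > 0 {
            let Some(resting) = level.front_mut() else { break };
            let traded = qty.min(resting.qty);
            fills.push((price, traded));
            qty -= traded;
            resting.qty -= traded;
            if resting.qty == 0 { level.pop_front(); } // fully filled, leaves the book
        }
        if level.is_empty() { asks.remove(&price); } // drop empty price level
    }
    fills
}

fn main() {
    let mut asks = BTreeMap::new();
    asks.insert(101, VecDeque::from([Order { id: 1, qty: 5 }]));
    asks.insert(102, VecDeque::from([Order { id: 2, qty: 5 }]));
    let fills = match_buy(&mut asks, 101, 8);
    assert_eq!(fills, vec![(101, 5)]); // only the 101 level crosses; 3 remain unfilled
    println!("fills: {:?}", fills);
}
```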

Matching engine — sequencer pattern

Client A ──┐
Client B ──┼──► mpsc::channel ──► Single-threaded matcher ──► Fill events
Client C ──┘                                                    └──► broadcast
  • All orders enter through one MPSC channel
  • Matching core is single-threaded — no locks needed inside
  • Deterministic: same input → same output (required for audit/replay)
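The sequencer pattern can be sketched with std primitives (std::thread + std::sync::mpsc stand in for Tokio tasks and tokio::sync::mpsc): many producers, one channel, one single-threaded consumer touching state.

```rust
use std::sync::mpsc;
use std::thread;

#[allow(dead_code)]
#[derive(Debug)]
struct Order { client: u64, qty: u64 }

// Sequencer pattern: producer threads funnel orders into one channel;
// the single-threaded matcher consumes them in arrival order, so no
// locks are needed inside the core and a given input sequence always
// produces the same output (replayable for audit).
fn run_matcher(orders_per_client: u64, clients: u64) -> u64 {
    let (tx, rx) = mpsc::channel::<Order>();
    let mut producers = Vec::new();
    for c in 0..clients {
        let tx = tx.clone();
        producers.push(thread::spawn(move || {
            for _ in 0..orders_per_client {
                tx.send(Order { client: c, qty: 1 }).unwrap();
            }
        }));
    }
    drop(tx); // close the channel once all producer clones are gone

    // Single-threaded matching core: the only place state is touched.
    let mut processed = 0;
    for _order in rx {
        processed += 1; // a real matcher would mutate the book here
    }
    for p in producers { p.join().unwrap(); }
    processed
}

fn main() {
    assert_eq!(run_matcher(100, 3), 300);
    println!("matcher consumed every order exactly once");
}
```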

Order types

| Type | Behaviour |
| --- | --- |
| Limit | Rest on book at specified price |
| Market | Fill immediately at best available price |
| IOC (Immediate or Cancel) | Fill what's available, cancel remainder |
| FOK (Fill or Kill) | Fill entire qty or cancel entirely |
| Post-Only | Reject if it would take liquidity |
| Stop / Stop-Limit | Triggered when price crosses level |

Market data — sequence gaps

  1. Subscribe to WebSocket delta stream
  2. Receive snapshot with seq=1000
  3. Apply deltas with seq > 1000 in order
  4. If gap detected → re-fetch snapshot, buffer out-of-order deltas
  5. Each update format: {seq, bids: [[price, qty]], asks: [[price, qty]]}
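The recovery steps above reduce to a small state machine. A sketch (the Delta/BookSync names are illustrative, and the actual book mutation is elided):

```rust
// Gap detection on a delta stream: apply contiguous deltas, drop
// duplicates already covered by the snapshot, and on a gap buffer the
// out-of-order delta while the caller re-fetches a snapshot.
#[derive(Debug, Clone, PartialEq)]
struct Delta { seq: u64 }

#[derive(Debug)]
enum Applied { Ok, Duplicate, GapDetected }

struct BookSync {
    last_seq: u64,        // seq of the snapshot or last applied delta
    buffered: Vec<Delta>, // out-of-order deltas held during resync
}

impl BookSync {
    fn new(snapshot_seq: u64) -> Self {
        BookSync { last_seq: snapshot_seq, buffered: Vec::new() }
    }

    fn on_delta(&mut self, d: Delta) -> Applied {
        if d.seq <= self.last_seq {
            Applied::Duplicate // already covered by snapshot / earlier delta
        } else if d.seq == self.last_seq + 1 {
            self.last_seq = d.seq; // contiguous: apply to the book here
            Applied::Ok
        } else {
            self.buffered.push(d); // gap: hold it, caller re-fetches snapshot
            Applied::GapDetected
        }
    }
}

fn main() {
    let mut sync = BookSync::new(1000);
    assert!(matches!(sync.on_delta(Delta { seq: 1001 }), Applied::Ok));
    assert!(matches!(sync.on_delta(Delta { seq: 999 }), Applied::Duplicate));
    assert!(matches!(sync.on_delta(Delta { seq: 1005 }), Applied::GapDetected));
    assert_eq!(sync.last_seq, 1001);
    println!("buffered during resync: {:?}", sync.buffered);
}
```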

Risk engine

struct RiskEngine {
    positions: DashMap<AssetId, AtomicI64>,           // lock-free reads
    balances: Arc<RwLock<HashMap<UserId, Balance>>>,  // read-heavy
}
 
impl RiskEngine {
    fn pre_trade_check(&self, order: &Order) -> Result<(), RiskError> {
        // Target: <1μs — no heap alloc, no I/O
        let pos = self.positions
            .get(&order.asset)
            .map(|p| p.load(Ordering::Acquire))
            .unwrap_or(0);
        if pos + order.qty > MAX_POSITION {
            return Err(RiskError::PositionLimitExceeded);
        }
        Ok(())
    }
}

FIX protocol

  • Industry standard for institutional order flow
  • Logon → Heartbeat → New Order Single → Execution Report
  • Session-level: sequence numbers, resend request on gap
  • Rust crates: fix-rs, quickfix-rs

WebSocket server pattern

use futures_util::StreamExt; // needed for stream.next()

let (market_tx, _) = broadcast::channel::<MarketUpdate>(4096);

async fn handle_ws_client(
    mut stream: WebSocketStream<TcpStream>,
    mut market_rx: broadcast::Receiver<MarketUpdate>,
) {
    loop {
        tokio::select! {
            update = market_rx.recv() => { /* send to client */ }
            msg = stream.next() => { /* handle subscribe/unsubscribe */ }
        }
    }
}

6) Key crates to know

| Crate | Purpose |
| --- | --- |
| tokio | Async runtime — the standard for network services |
| tokio-tungstenite | WebSocket server/client on Tokio |
| serde / serde_json | Serialization — REST API payloads |
| rkyv / flatbuffers | Zero-copy serialization for hot paths |
| crossbeam | Better channels + epoch-based GC for lock-free structures |
| dashmap | Concurrent HashMap without global lock |
| parking_lot | Faster Mutex/RwLock than std |
| bumpalo | Arena allocator — zero heap alloc per order |
| thiserror | Typed error enums for domain code |
| anyhow | Error propagation in application/binary code |
| criterion | Micro-benchmarking |
| tracing | Structured async-aware logging |
| prometheus | Metrics export |
| rust_decimal | Exact decimal arithmetic for prices/quantities |
| axum | REST API server (Tokio-native) |

7) System design questions

Design a limit order book

  1. Data structure: BTreeMap<Price, VecDeque<Order>> per side
  2. Operations: insert O(log n), cancel O(log n), match O(1) at top of book
  3. Concurrency: single-threaded core fed by MPSC — no locks inside matcher
  4. Persistence: WAL (write-ahead log) on every order event for crash recovery
  5. Market data: emit delta on every mutation with sequence number
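Step 4's WAL guarantee hinges on one invariant: every event is logged before it mutates in-memory state, so replaying the log from empty state reproduces exactly the same result. A sketch, with an in-memory Vec standing in for the durable (fsynced) file:

```rust
// WAL sketch: log first, apply second — replay reconstructs state.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Event { Fill { qty: i64 }, Cancel { qty: i64 } }

#[derive(Default)]
struct Engine {
    wal: Vec<Event>, // stands in for an append-only file on disk
    open_qty: i64,
}

impl Engine {
    fn apply(&mut self, e: Event) {
        self.wal.push(e);        // 1. durably log (fsync in production)
        self.apply_in_memory(e); // 2. only then mutate state
    }

    fn apply_in_memory(&mut self, e: Event) {
        match e {
            Event::Fill { qty } => self.open_qty += qty,
            Event::Cancel { qty } => self.open_qty -= qty,
        }
    }

    // Crash recovery: rebuild state by replaying the log from scratch.
    fn replay(wal: &[Event]) -> i64 {
        let mut fresh = Engine::default();
        for &e in wal {
            fresh.apply_in_memory(e);
        }
        fresh.open_qty
    }
}

fn main() {
    let mut engine = Engine::default();
    engine.apply(Event::Fill { qty: 10 });
    engine.apply(Event::Cancel { qty: 3 });
    // Replayed state matches live state — deterministic recovery.
    assert_eq!(Engine::replay(&engine.wal), engine.open_qty);
    println!("recovered open_qty = {}", engine.open_qty);
}
```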

Design a matching engine handling 100k orders/sec

  1. Sequencer pattern: MPSC channel as the serialization point
  2. Pre-allocate order pool at startup (arena allocator)
  3. Avoid heap allocation in hot path
  4. Single thread pinned to isolated CPU core
  5. Async I/O on separate Tokio threads
  6. Benchmark with criterion, profile with cargo-flamegraph

Handle WebSocket sequence gaps

Subscribe → receive REST snapshot (seq=N) → apply deltas with seq > N → on gap: re-fetch snapshot → resume. Buffer out-of-order deltas during the snapshot fetch and discard those with seq ≤ the new snapshot's seq.

Sub-microsecond risk check

AtomicI64 per position, compare_exchange for CAS-based reservation. No heap alloc. No I/O. Inline the hot path. Pin thread to core. Use Ordering::AcqRel for producer/consumer pairs.
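A sketch of the compare_exchange reservation just described — read the position, check the limit, commit only if no other thread changed it meanwhile. MAX_POSITION and the error type are illustrative.

```rust
use std::sync::atomic::{AtomicI64, Ordering};

const MAX_POSITION: i64 = 1_000; // illustrative limit

// CAS-based position reservation: lock-free, no heap allocation,
// no I/O — the shape of a sub-microsecond pre-trade check.
fn reserve(position: &AtomicI64, qty: i64) -> Result<(), &'static str> {
    let mut current = position.load(Ordering::Acquire);
    loop {
        let proposed = current + qty;
        if proposed > MAX_POSITION {
            return Err("position limit exceeded");
        }
        match position.compare_exchange(
            current, proposed, Ordering::AcqRel, Ordering::Acquire,
        ) {
            Ok(_) => return Ok(()),
            // Lost the race to another thread: retry with the fresh value.
            Err(observed) => current = observed,
        }
    }
}

fn main() {
    let position = AtomicI64::new(990);
    assert!(reserve(&position, 10).is_ok()); // exactly at the limit
    assert!(reserve(&position, 1).is_err()); // would exceed it
    assert_eq!(position.load(Ordering::Acquire), 1_000);
    println!("position = {}", position.load(Ordering::Acquire));
}
```

The retry loop is the standard CAS idiom: the failed compare_exchange returns the observed value, so no extra load is needed.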


8) Rust interview Q&A

Q: What is the borrow checker and why does it matter for trading systems? The borrow checker enforces at compile time that references don’t outlive data and mutable access is exclusive. For trading systems: data races are rejected at compile time — no corrupted order state, no use-after-free. No GC = no latency spikes.

Q: When do you use Arc<Mutex<T>> vs channels? Arc<Mutex<T>> for state multiple components read/write (account balances). Channels for one-directional message passing where the receiver owns state exclusively — prefer channels in the matching engine core because the sequencer pattern eliminates lock contention entirely.

Q: Explain Send and Sync. When does Tokio care? Send: value can be moved to another thread. Sync: &T can be shared across threads. Tokio’s spawn requires futures to be Send (work-stealing scheduler may run them on any thread). Rc<T> is not Send — use Arc<T> across threads.

Q: async fn vs spawn_blocking — when? async fn for I/O-bound work (network, disk). spawn_blocking for CPU-bound work that would block the async executor (heavy crypto, order replay). Blocking the executor starves other tasks.

Q: How do you avoid allocations in a hot path? Pre-allocate a pool at startup with bumpalo or a fixed-size ring buffer. Reuse memory per-order. Use value types on the stack. Avoid Box<T>, Vec growth, String in the matching loop. Verify with cargo-flamegraph.
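The fixed-size ring buffer mentioned above can be sketched as follows — one allocation at startup, then push/pop only move indices, so the hot loop never touches the allocator. A simplified single-threaded sketch; production code would also consider a lock-free SPSC variant.

```rust
// Fixed-capacity ring buffer: zero per-message heap traffic.
struct Ring<T> {
    slots: Vec<Option<T>>, // allocated once at startup, never grows
    head: usize,           // next slot to pop
    tail: usize,           // next slot to push
    len: usize,
}

impl<T> Ring<T> {
    fn with_capacity(cap: usize) -> Self {
        Ring { slots: (0..cap).map(|_| None).collect(), head: 0, tail: 0, len: 0 }
    }

    fn push(&mut self, v: T) -> Result<(), T> {
        if self.len == self.slots.len() {
            return Err(v); // full: backpressure instead of reallocation
        }
        self.slots[self.tail] = Some(v);
        self.tail = (self.tail + 1) % self.slots.len();
        self.len += 1;
        Ok(())
    }

    fn pop(&mut self) -> Option<T> {
        if self.len == 0 { return None; }
        let v = self.slots[self.head].take();
        self.head = (self.head + 1) % self.slots.len();
        self.len -= 1;
        v
    }
}

fn main() {
    let mut ring: Ring<u64> = Ring::with_capacity(2);
    assert!(ring.push(1).is_ok());
    assert!(ring.push(2).is_ok());
    assert!(ring.push(3).is_err()); // full — caller decides drop/retry policy
    assert_eq!(ring.pop(), Some(1)); // FIFO order
    println!("drained in FIFO order");
}
```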

Q: What is false sharing and how do you fix it? Two fields on the same 64-byte cache line cause cache invalidation storms when written from different threads. Fix with #[repr(align(64))] to force each field to its own cache line.

Q: thiserror vs anyhow? thiserror for domain/library code — typed errors callers can match on. anyhow for application code — ergonomic ? propagation without defining types. Domain layers use thiserror, the binary’s main uses anyhow.

Q: parking_lot::Mutex vs std::sync::Mutex? parking_lot is faster (no poisoning, smaller, better contention handling). Use in performance-sensitive code. std::sync::Mutex in public library APIs for compatibility.

Q: How does Tokio’s work-stealing scheduler work? Each worker thread has a local task queue. When idle, it steals from other threads. Tasks are polled when their Future is woken (I/O ready, timer fired, channel message). Never block the executor with CPU-heavy work.


9) Exchange domain Q&A

Q: What’s the difference between IOC and FOK? IOC fills what’s available at the limit price and cancels the remainder. FOK requires the entire quantity to be filled immediately or the whole order is cancelled — no partial fills.

Q: How does price-time priority work? Orders at the same price level are matched in arrival order (FIFO). Better price always takes priority over time. BTreeMap<Price, VecDeque<Order>> — the deque gives FIFO within a price level.

Q: What is a mark price and why does it matter? An index-derived fair value price used for liquidations and unrealized PnL, resistant to manipulation on any single venue. OKX uses a composite index from multiple sources. Critical for perpetual futures — liquidation price is based on mark, not last trade.

Q: How do you handle duplicate orders? Client sends a client_order_id (idempotency key). Gateway deduplicates in a short-lived LRU cache keyed by (user_id, client_order_id). If seen, return the original result without re-processing.
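The dedup cache just described can be sketched with a HashMap plus FIFO eviction queue. The names and cached-response type are illustrative; a production gateway would use a proper LRU with TTLs.

```rust
use std::collections::{HashMap, VecDeque};

// Idempotency sketch: dedupe by (user_id, client_order_id) in a small
// bounded cache, evicting the oldest entry when full.
struct DedupCache {
    seen: HashMap<(u64, u64), String>, // key -> original response
    order: VecDeque<(u64, u64)>,       // insertion order for eviction
    cap: usize,
}

impl DedupCache {
    fn new(cap: usize) -> Self {
        DedupCache { seen: HashMap::new(), order: VecDeque::new(), cap }
    }

    /// Returns the cached response for a duplicate, or None for a new order.
    fn check_or_insert(&mut self, key: (u64, u64), response: &str) -> Option<String> {
        if let Some(prev) = self.seen.get(&key) {
            return Some(prev.clone()); // duplicate: replay the original result
        }
        if self.order.len() == self.cap {
            if let Some(old) = self.order.pop_front() {
                self.seen.remove(&old); // evict oldest entry
            }
        }
        self.order.push_back(key);
        self.seen.insert(key, response.to_string());
        None
    }
}

fn main() {
    let mut cache = DedupCache::new(2);
    assert!(cache.check_or_insert((1, 100), "ack-1").is_none()); // first time seen
    assert_eq!(cache.check_or_insert((1, 100), "ack-dup"), Some("ack-1".into()));
    println!("duplicate replayed the original ack");
}
```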

Q: What is a circuit breaker in exchange context? Automatic halt when price moves beyond a threshold in a short window (e.g., >5% in 1 minute). Prevents cascading liquidations. Exchange-level: halt matching. Product-level: widen spread, reject aggressive orders.

Q: What is a WAL and why do matching engines need it? Write-Ahead Log — every order event is durably written to disk before being applied to in-memory state. On crash, replay the WAL to reconstruct exact state. Required for audit trails and deterministic crash recovery.


10) Build project idea — mini matching engine

What to build

A limit order book matching engine with:

  • REST API to submit/cancel orders (axum)
  • WebSocket market data feed with sequence numbers (tokio-tungstenite)
  • MPSC sequencer pattern — single-threaded matching core
  • Pre-trade risk check (balance/position limits)
  • Structured logging (tracing) + metrics (prometheus)
  • criterion benchmarks showing throughput

Step-by-step

| Step | What it proves |
| --- | --- |
| OrderBook struct with BTreeMap | Core data structure knowledge |
| Matching logic (limit + market + IOC) | Domain understanding |
| MPSC sequencer + Tokio runtime | Async + concurrency patterns |
| Axum REST endpoints (submit/cancel/status) | API layer |
| WS feed with sequence numbers | Market data pipeline |
| AtomicI64 risk checks | Low-latency systems thinking |
| criterion benchmarks | Performance engineering mindset |

Repo structure

mini-exchange/
  src/
    order_book.rs   # BTreeMap-based book + matching
    risk.rs         # AtomicI64 position checks
    gateway.rs      # Axum REST handlers
    market_data.rs  # broadcast channel + WS server
    main.rs         # Tokio runtime, wiring
  benches/
    matching.rs     # criterion benchmark

11) Crash plan — 7 days

| Day | Focus | Goal |
| --- | --- | --- |
| 1 | Rust refresh — ownership, lifetimes, traits, error handling | Can explain borrow checker fluently |
| 2 | Async Rust — Tokio, channels (mpsc/broadcast/watch), select! | Can design async pipeline from memory |
| 3 | Concurrency — atomics, Mutex vs RwLock, false sharing, crossbeam | Can reason about data races and cache behaviour |
| 4 | Order book — implement BTreeMap-based book + matching logic | Working code to walk through in interview |
| 5 | CEX domain — order types, FIX, risk engine, market data gaps | Can answer all domain Q&A above |
| 6 | System design — matching engine 100k/s, WS fan-out, risk checks | Can whiteboard any of the 4 designs cold |
| 7 | Mock interview — explain every section aloud, timed | Comfortable pace, no hesitation on Rust fundamentals |

12) Must-know shortlist

If you have 1 day, know these cold:

Rust

  • Ownership / borrowing / lifetimes in one sentence each
  • Arc<Mutex<T>> vs channels — when to use which
  • Send + Sync — what they mean and why Tokio needs them
  • async fn vs spawn_blocking
  • thiserror vs anyhow
  • Atomic Ordering — Relaxed vs AcqRel vs SeqCst
  • False sharing — what it is and how to fix it
  • parking_lot vs std::sync::Mutex

Exchange domain

  • Order book data structure + matching algorithm
  • Sequencer/MPSC pattern for matching engine
  • IOC vs FOK vs Post-Only
  • Price-time priority (FIFO within price level)
  • WebSocket delta stream + sequence gap recovery
  • Pre-trade risk check design (AtomicI64, <1μs)
  • Mark price purpose (liquidations, PnL)
  • WAL for crash recovery + audit trail

Crates to name-drop

  • tokio, axum, tokio-tungstenite
  • crossbeam, dashmap, parking_lot
  • thiserror, anyhow, tracing
  • criterion, bumpalo, rust_decimal