The evolution of algorithmic execution strategies.
From phone-traded blocks to reinforcement-learning execution: how three decades of microstructure change reshaped the way institutional capital actually moves.
From phone-traded blocks to electronic execution
Until roughly 2000, institutional execution was a relationship business. A portfolio manager phoned a sales-trader at a bank, the sales-trader negotiated a block price with the firm's principal desk, and the trade printed at a single price for the entire size. Implementation shortfall — a concept formalised by Perold in 1988 — existed in academic literature, but the practical alternative to the block price was the same firm offering to work the order over hours via desk discretion. There was little quantitative cost decomposition.
The shift to electronic markets and venue fragmentation in the 2000s broke this model. Decimalisation in US equities (2001), the rise of ECNs (Island, BRUT, Archipelago), Reg NMS (2007), and MiFID I in Europe (2007) created an environment with dozens of venues, displayed and dark, where the same instrument could quote different prices and depths simultaneously. Manual block negotiation could no longer access the full liquidity stack; algorithmic execution became operationally necessary, not optional.
By 2015 the institutional default was algorithmic execution for any order above a threshold, with manual desk-traded blocks reserved for genuinely illiquid names or strategically sensitive trades. Today, well over 80% of institutional US equity volume is executed by algorithm, and a similar share of FX flow above retail size. The history of institutional execution since 2000 is largely the history of the algorithms that absorbed it.
The first generation: TWAP and VWAP
The first widely deployed execution algorithms were the simplest possible: slice the order into pieces and execute them on a fixed schedule. TWAP (time-weighted average price) divides the order into N equal child orders spaced uniformly across the execution window. VWAP (volume-weighted average price) divides the order into N child orders sized in proportion to the historical volume profile of the instrument, so that more is executed during high-volume periods.
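The two slicing rules can be sketched in a few lines. This is a minimal illustration, not a production scheduler: the function names, the six-bucket window, and the U-shaped volume profile are all hypothetical, and the sketch ignores lot sizes, limit prices, and timing randomisation.

```python
from typing import List


def twap_schedule(total_qty: int, n_slices: int) -> List[int]:
    """Split total_qty into n_slices near-equal child orders."""
    if n_slices <= 0:
        raise ValueError("n_slices must be positive")
    base, remainder = divmod(total_qty, n_slices)
    # Spread the remainder over the first slices so the sizes sum exactly.
    return [base + (1 if i < remainder else 0) for i in range(n_slices)]


def vwap_schedule(total_qty: int, volume_profile: List[float]) -> List[int]:
    """Size child orders in proportion to a historical volume profile."""
    total_volume = sum(volume_profile)
    if total_volume <= 0:
        raise ValueError("volume profile must have positive total volume")
    # Round cumulative targets and difference them, so rounding errors
    # do not accumulate and the slices always sum to total_qty.
    schedule, cum_volume, allocated = [], 0.0, 0
    for bucket_volume in volume_profile:
        cum_volume += bucket_volume
        target = round(total_qty * cum_volume / total_volume)
        schedule.append(target - allocated)
        allocated = target
    return schedule


if __name__ == "__main__":
    print(twap_schedule(10_000, 6))
    # Hypothetical U-shaped profile: the open and close buckets are busiest.
    print(vwap_schedule(10_000, [30, 15, 10, 10, 15, 20]))
```

Both functions return quantities only; when each slice is released within its bucket is a separate decision that this sketch leaves out.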
Both algorithms are easy to implement, easy to explain, easy to benchmark. VWAP became the dominant institutional execution benchmark precisely because its target is observable: the volume-weighted average price the market itself produced over the window. A broker that beat VWAP added value; a broker that missed it underperformed. The benchmark created its own discipline.
The limitations are well-understood. TWAP and VWAP are schedule-based, not price-based. They execute on the planned slice regardless of current market conditions. A VWAP execution that happens to schedule a large slice into a sudden adverse price move pays full impact at the worst time. Static schedules also signal the order's existence — high-frequency observers can detect the regular slicing and trade ahead of the residual.
Despite these limitations, both remain in production use today. They are the right answer for orders where the priority is predictable, easily audited execution rather than price-aware execution, and for benchmarking scenarios where the institutional client requires a transparent reference point. The institutional state of the art has moved past them, but they have not gone away.
Implementation shortfall and the Almgren-Chriss frontier
Implementation shortfall — the gap between the price at order arrival and the price actually achieved — is the canonical measure of execution cost. It captures both impact (the price moved against the order during execution) and opportunity cost (the price moved away while the order waited). Minimising one increases the other, and the right balance depends on the strategy's tolerance for risk.
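As a worked illustration of the measure, the sketch below splits shortfall into a cost on the executed quantity and an opportunity cost on the unfilled residual. That split is one common convention and an assumption here, as are the function name, the dictionary keys, and the example numbers.

```python
from typing import Dict, List, Tuple


def implementation_shortfall_bps(
    side: int,                         # +1 for a buy, -1 for a sell
    arrival_price: float,
    order_qty: float,
    fills: List[Tuple[float, float]],  # (price, quantity) for each fill
    final_price: float,                # price used to mark the unfilled residual
) -> Dict[str, float]:
    """Shortfall versus arrival price, in basis points of order notional."""
    filled_qty = sum(qty for _, qty in fills)
    if filled_qty > order_qty:
        raise ValueError("fills exceed the order quantity")
    # Cost on what was executed: fill price versus arrival price.
    execution_cost = side * sum((price - arrival_price) * qty for price, qty in fills)
    # Cost of what was never executed: the move the residual missed.
    opportunity_cost = side * (final_price - arrival_price) * (order_qty - filled_qty)
    to_bps = 1e4 / (arrival_price * order_qty)
    return {
        "execution_bps": execution_cost * to_bps,
        "opportunity_bps": opportunity_cost * to_bps,
        "total_bps": (execution_cost + opportunity_cost) * to_bps,
    }


if __name__ == "__main__":
    # Hypothetical buy of 1,000 shares: 800 filled, 200 left unfilled.
    result = implementation_shortfall_bps(
        side=+1,
        arrival_price=100.00,
        order_qty=1000,
        fills=[(100.02, 400), (100.05, 400)],
        final_price=100.10,
    )
    print(result)
```

Trading faster would shrink the residual and its opportunity cost but would typically worsen the fill prices, which is the trade-off described above.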
The seminal contribution is Almgren-Chriss (2000–2001). In its base form the model takes price impact to be linear in the trading rate, decomposes it into permanent (information-bearing) and temporary (liquidity-consumption) components, and derives the closed-form optimal execution trajectory that minimises expected cost plus a risk-aversion term times execution-cost variance. Square-root impact in trade size relative to volume comes from later empirical calibrations rather than from the original model. The result is a deterministic schedule: a risk-neutral trader executes at a constant rate, a risk-averse trader front-loads the order, and sweeping the risk-aversion parameter traces the Almgren-Chriss efficient frontier of expected cost against variance.
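For intuition, here is a minimal sketch of the closed-form holdings trajectory under linear temporary impact, using the continuous-time approximation in which the decay rate is kappa = sqrt(lambda * sigma^2 / eta). The approximation, the parameter names, and the example values are assumptions made for illustration; the discrete-time model adds a correction to kappa that is omitted here.

```python
import math
from typing import List


def almgren_chriss_holdings(
    total_qty: float,
    n_intervals: int,
    horizon: float,        # T, in the same time unit as sigma
    sigma: float,          # price volatility per sqrt(time unit)
    eta: float,            # linear temporary-impact coefficient, must be > 0
    risk_aversion: float,  # lambda >= 0
) -> List[float]:
    """Remaining holdings at the n_intervals + 1 grid points from 0 to T."""
    if eta <= 0 or risk_aversion < 0:
        raise ValueError("eta must be positive and risk_aversion non-negative")
    kappa = math.sqrt(risk_aversion * sigma ** 2 / eta)
    times = [horizon * k / n_intervals for k in range(n_intervals + 1)]
    if kappa * horizon < 1e-8:
        # Risk-neutral limit: constant trading rate, a straight line to zero.
        return [total_qty * (1.0 - t / horizon) for t in times]
    denom = math.sinh(kappa * horizon)
    return [total_qty * math.sinh(kappa * (horizon - t)) / denom for t in times]


if __name__ == "__main__":
    # Hypothetical inputs: one million shares over five daily intervals.
    for lam in (0.0, 2e-7, 2e-6):
        path = almgren_chriss_holdings(1_000_000, 5, 5.0, 0.95, 2.5e-6, lam)
        print(lam, [round(x) for x in path])
```

Raising the risk-aversion input makes the path steeper at the start, which is the front-loading behaviour; setting it to zero recovers an evenly paced schedule.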
The model's importance is not the specific schedule it produces but the framework it established. Execution became an explicit optimisation problem with a tractable cost function and a tunable risk-aversion parameter — the same paradigm portfolio construction had operated under for decades. Subsequent work (Obizhaeva-Wang, Gatheral, Almgren and successors) extended the framework to account for transient impact, stochastic volume profiles, and multi-asset coordination.
Almgren-Chriss-style implementation shortfall algorithms became the institutional default in equities through the 2010s. They remain the baseline against which more sophisticated algorithms are measured.
Liquidity-seeking and adaptive algorithms
Implementation shortfall in its original form is still schedule-based. Once the optimal trajectory is computed, the algorithm executes against it. Real markets do not cooperate with this. Liquidity arrives in bursts; the optimal trajectory at order arrival may be sub-optimal five minutes later as conditions change.
Liquidity-seeking algorithms abandoned the static schedule. Instead of executing predetermined slices on a clock, they execute opportunistically — posting passive orders when the spread implies favourable fill probability, lifting visible top-of-book when the displayed depth is large enough to absorb a slice without moving the mid, sending dark-pool pings when adverse-selection risk is contained. The order's planned schedule becomes an envelope rather than a script.
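The idea of a schedule acting as an envelope can be sketched as a simple decision rule: stay inside a minimum and maximum cumulative quantity, and act opportunistically in between. The tactic names, the thresholds, and the fields of `BookSnapshot` are hypothetical placeholders, not a description of any particular broker's logic.

```python
from dataclasses import dataclass

# All thresholds are hypothetical placeholders, not calibrated values.
MIN_DEPTH_MULTIPLE = 3.0     # displayed depth must cover the slice this many times
MIN_PASSIVE_FILL_PROB = 0.6  # minimum estimated fill probability to post passively
MAX_DARK_TOXICITY_BPS = 1.0  # maximum tolerated expected adverse markout


@dataclass(frozen=True)
class BookSnapshot:
    displayed_depth: int      # shares shown at the opposite touch
    passive_fill_prob: float  # estimated chance that a posted order fills
    dark_toxicity_bps: float  # expected adverse markout on a dark fill


def choose_tactic(
    book: BookSnapshot,
    slice_qty: int,
    executed_qty: int,
    envelope_min: int,  # least that should be done by now
    envelope_max: int,  # most that may be done by now
) -> str:
    """Pick a tactic for the next slice while staying inside the envelope."""
    if executed_qty >= envelope_max:
        return "wait"       # ahead of the envelope: do nothing
    if executed_qty < envelope_min:
        return "take"       # behind the envelope: catch up aggressively
    if book.displayed_depth >= MIN_DEPTH_MULTIPLE * slice_qty:
        return "take"       # enough visible depth to absorb the slice
    if book.passive_fill_prob >= MIN_PASSIVE_FILL_PROB:
        return "post"       # a passive order is likely to fill
    if book.dark_toxicity_bps <= MAX_DARK_TOXICITY_BPS:
        return "dark_ping"  # adverse-selection risk looks contained
    return "wait"


if __name__ == "__main__":
    thin_book = BookSnapshot(displayed_depth=500, passive_fill_prob=0.7, dark_toxicity_bps=2.0)
    print(choose_tactic(thin_book, slice_qty=1000, executed_qty=4000,
                        envelope_min=3000, envelope_max=6000))
```

The ordering of the checks is itself a design choice: the envelope bounds come first so that opportunism never overrides the limits on how far the order may drift from plan.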
Adaptive implementation shortfall algorithms went a step further. The risk-aversion parameter, originally a static input, became dynamic: as the order's realised slippage tracked above the planned trajectory, the algorithm increased urgency to recoup ground; as it tracked below, it slowed down to capture continued favourable execution. The algorithm's behaviour conditioned on its own performance.
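A minimal sketch of that feedback loop follows, assuming a linear response to the slippage gap with a floor and a cap. The function name, the sensitivity, and the bounds are hypothetical; real implementations condition on far more than a single gap.

```python
def adapt_urgency(
    base_urgency: float,
    realised_slippage_bps: float,
    planned_slippage_bps: float,
    sensitivity: float = 0.1,  # hypothetical: urgency change per bp of gap
    floor: float = 0.25,       # hypothetical lower bound on the multiplier
    cap: float = 4.0,          # hypothetical upper bound on the multiplier
) -> float:
    """Scale urgency up when slippage runs above plan, down when below."""
    gap_bps = realised_slippage_bps - planned_slippage_bps
    multiplier = 1.0 + sensitivity * gap_bps
    # Clamp so that one bad reading cannot send urgency to an extreme.
    multiplier = min(cap, max(floor, multiplier))
    return base_urgency * multiplier


if __name__ == "__main__":
    # Slippage 3 bp worse than plan: urgency rises.
    print(adapt_urgency(1.0, realised_slippage_bps=6.0, planned_slippage_bps=3.0))
    # Slippage 2 bp better than plan: urgency falls.
    print(adapt_urgency(1.0, realised_slippage_bps=1.0, planned_slippage_bps=3.0))
```

In a setup built on a risk-aversion parameter, the returned urgency could be mapped onto that parameter before the trajectory is recomputed; that mapping is left out of the sketch.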
The empirical results are meaningful. Top-tier institutional brokers reported 1–4 basis points of average implementation-shortfall improvement from adaptive algorithms versus schedule-based predecessors. On compounded annual flow, that translates to seven- and eight-figure cost savings for large institutional clients, and to detectable Sharpe-ratio improvements for the strategies that depend on tight execution.
Cross-venue smart order routing
By 2010, US equities traded across 14 lit exchanges and over 30 dark pools, with materially different fee structures, queue priorities, and order-type sets. Every child order needed a routing decision. The naive answer, sending everything to the venue with the tightest displayed quote, left substantial cost on the table by ignoring rebate structures, fill probabilities, and post-trade adverse selection.
Smart Order Routers (SORs) emerged to orchestrate this choice. A modern SOR routes a child order across multiple venues simultaneously based on a real-time scoring of each venue's expected fill quality, fee net of rebate, displayed depth, and historical adverse-selection profile. Posting passive orders favours maker-rebate venues; aggressive orders prefer venues with no taker fee and deep displayed liquidity; midpoint orders are routed to dark pools with low historical post-trade markout.
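A toy version of the scoring step is sketched below. It assumes a linear score that rewards fill probability and displayed depth and penalises net fees and adverse markout, then splits the child order in proportion to the positive scores. The weight, the venue names, and every statistic are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

COST_WEIGHT = 0.1  # hypothetical: score penalty per basis point of expected cost


@dataclass(frozen=True)
class VenueStats:
    name: str
    fill_probability: float  # estimated chance the child order fills, 0..1
    fee_bps: float           # fee net of rebate; negative means a net rebate
    displayed_depth: int     # shares shown at the relevant price level
    markout_bps: float       # average adverse post-trade move, in bps


def venue_score(venue: VenueStats, order_qty: int) -> float:
    """Higher is better: likely fills and enough depth, minus expected cost."""
    depth_ratio = min(1.0, venue.displayed_depth / order_qty)
    expected_cost_bps = venue.fee_bps + venue.markout_bps
    return venue.fill_probability * depth_ratio - COST_WEIGHT * expected_cost_bps


def route(order_qty: int, venues: List[VenueStats]) -> Dict[str, int]:
    """Split order_qty across venues in proportion to their positive scores."""
    scores = {v.name: max(0.0, venue_score(v, order_qty)) for v in venues}
    total = sum(scores.values())
    if total == 0:
        return {}
    # Round cumulative targets so the allocation sums exactly to order_qty.
    allocation, cum_score, allocated = {}, 0.0, 0
    for name, score in scores.items():
        cum_score += score
        target = round(order_qty * cum_score / total)
        allocation[name] = target - allocated
        allocated = target
    return allocation


if __name__ == "__main__":
    venues = [
        VenueStats("LIT_A", 0.9, 0.3, 5000, 0.2),
        VenueStats("DARK_B", 0.8, 0.0, 5000, 3.0),   # toxic: high markout
        VenueStats("REBATE_C", 0.5, -0.2, 1000, 0.1),
    ]
    print(route(2000, venues))
```

The sketch does not cap an allocation at the venue's displayed depth and treats passive and aggressive orders alike; a real router would score each order type separately, as the paragraph above describes.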
Dark-pool selection became an analytical sub-discipline within SOR. Each dark pool has a different mix of participants: retail wholesalers, hedge-fund flow, principal-trading firms, agency block traders. The mix determines toxicity: in a pool dominated by informed flow, fills systematically arrive just before the price moves against the order that was filled. Institutional desks measure post-trade markouts at 100ms, 1s, 10s, and 1min on every fill from every pool, and SORs route to or away from pools based on that scoring.
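The markout calculation itself is simple. The sketch below assumes a sorted series of mid-price observations and measures, for one fill, the signed move in the mid at each horizon; the sign convention (negative means the price moved against the fill), the function names, and the sample data are assumptions.

```python
from bisect import bisect_right
from typing import Dict, Sequence

HORIZONS_S = (0.1, 1.0, 10.0, 60.0)  # 100ms, 1s, 10s, 1min


def mid_at(times: Sequence[float], mids: Sequence[float], t: float) -> float:
    """Last observed mid at or before time t (times must be sorted)."""
    i = bisect_right(times, t) - 1
    if i < 0:
        raise ValueError("no mid-price observation at or before the requested time")
    return mids[i]


def markouts_bps(
    side: int,            # +1 for a buy fill, -1 for a sell fill
    fill_price: float,
    fill_time: float,     # seconds, on the same clock as times
    times: Sequence[float],
    mids: Sequence[float],
    horizons: Sequence[float] = HORIZONS_S,
) -> Dict[float, float]:
    """Signed markout per horizon; negative means the price moved against us."""
    result = {}
    for horizon in horizons:
        # If the series ends before the horizon, the last mid is used (stale).
        future_mid = mid_at(times, mids, fill_time + horizon)
        result[horizon] = side * (future_mid - fill_price) / fill_price * 1e4
    return result


if __name__ == "__main__":
    # Hypothetical buy at 100.00 followed by a steadily falling mid.
    times = [0.0, 0.05, 0.5, 5.0, 30.0]
    mids = [100.00, 99.99, 99.97, 99.95, 99.90]
    print(markouts_bps(+1, 100.00, 0.0, times, mids))
```

Averaging these values per pool, per instrument, and per time of day gives the kind of toxicity score a router can act on.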
Cross-venue execution moved execution research from a single-venue optimisation problem (how do I trade this order on this exchange) to a portfolio problem (how do I allocate child orders across a venue set with heterogeneous economics). The institutional discipline today is to treat venue routing as a first-class research domain, calibrate it monthly per instrument and per time-of-day, and audit it continuously against post-trade outcomes.
Reinforcement learning and modern execution
The most recent generation of execution algorithms abandons closed-form optimisation entirely. Reinforcement learning agents are trained on millions of historical executions, optimising directly for realised implementation shortfall against a state representation that includes order-book depth, recent price and volume dynamics, fill quality history, time remaining in the execution window, and a learned representation of the regime.
The action space typically includes order type (passive limit, aggressive limit, market, midpoint peg), child-order size, venue choice, and aggressiveness. The reward is the negative of realised implementation shortfall. Training uses historical replay augmented with simulated counterfactual fills — what would have happened if the agent had chosen differently in past states — to build out enough action-coverage data for the policy to generalise.
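To make the formulation concrete, the sketch below writes the state, the action, and the per-fill reward as plain data types. The field names and the choice to credit reward fill by fill, in basis points of order notional, are assumptions for illustration; no training loop or simulator is shown.

```python
from dataclasses import dataclass
from enum import Enum


class OrderType(Enum):
    PASSIVE_LIMIT = "passive_limit"
    AGGRESSIVE_LIMIT = "aggressive_limit"
    MARKET = "market"
    MIDPOINT_PEG = "midpoint_peg"


@dataclass(frozen=True)
class State:
    bid_depth: float            # displayed size at the best bid
    ask_depth: float            # displayed size at the best ask
    recent_return_bps: float    # short-horizon price change
    recent_volume: float        # short-horizon traded volume
    time_remaining_frac: float  # share of the execution window left, 0..1
    qty_remaining_frac: float   # share of the parent order left, 0..1


@dataclass(frozen=True)
class Action:
    order_type: OrderType
    child_qty: int
    venue: str
    aggressiveness: float       # 0 = fully passive, 1 = fully aggressive


def step_reward(
    side: int,                  # +1 for a buy, -1 for a sell
    arrival_price: float,
    fill_price: float,
    fill_qty: float,
    order_qty: float,
) -> float:
    """Negative shortfall from one fill, in bps of the parent order's notional."""
    shortfall = side * (fill_price - arrival_price) * fill_qty
    return -shortfall / (arrival_price * order_qty) * 1e4


if __name__ == "__main__":
    action = Action(OrderType.MIDPOINT_PEG, child_qty=400, venue="DARK_B", aggressiveness=0.3)
    print(action)
    # A buy filled 5 cents above arrival earns a negative reward.
    print(step_reward(+1, 100.00, 100.05, 400, 1000))
```

Summing `step_reward` over all fills gives the negative shortfall on the executed quantity; any residual left at the end of the window would need its own terminal penalty, which this sketch omits.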
The strengths of RL execution are real. The agent adapts to non-stationary microstructure without requiring re-derivation of an optimal closed-form policy. It captures interactions — the optimal aggressiveness during a volatility spike conditional on having posted passively in the prior minute — that are not tractable in classical optimal-control frameworks. Top-tier published implementations report 0.5–1.5 bp of additional improvement over Almgren-Chriss-style adaptive baselines, on top of the gains those baselines already produced.
The challenges are equally real. RL execution requires extensive training data, ongoing monitoring for concept drift, and explainability tooling for regulated workflows. Several institutional brokers run RL execution as a default for liquid equities while keeping classical optimal-control as the fallback during regime stress. The mature institutional consensus is that RL is a refinement layer on top of well-understood classical execution, not a replacement for it.
Microstructure-aware execution across asset classes
Execution algorithms developed for US equities do not translate uncritically to other markets. Each asset class has microstructure features that change the optimal algorithm.
FX execution operates without a central exchange, with last-look quote rejection, prime-broker tiers, and ECN fragmentation. Algorithms that assume firm liquidity (as in equity execution) fail in FX. The institutional FX execution stack is built around per-counterparty toxicity scoring, last-look rejection prediction, and aggressiveness that adapts to the differences between liquidity tiers. The classical equity-derived algorithms run as scaffolding; the FX-specific overlays are where the value lives.
Futures execution is closer to equities — single exchange, firm depth, transparent order books — but with smaller universes and lumpier liquidity. Time-of-day matters more (overnight Asian, pit-open, close auction) and contract-roll dynamics introduce execution complexity that has no equity analogue.
Crypto execution spans a wide range. Top-tier centralised exchanges (Binance, Coinbase, OKX) have equity-like microstructure with deep books and predictable depth; second-tier exchanges have order-book quirks (rebate gaming, frequent depth withdrawal) that classical algorithms misprice; on-chain execution is its own discipline with MEV, slippage curves, and sandwich-attack risk.
The general principle: execution is per-instrument, per-venue, per-time-of-day. Universal algorithms produce mediocre results everywhere. Specialist execution stacks calibrated to the specific microstructure of the instrument they trade are how top-tier institutional desks extract the additional basis points that compound into meaningful PnL.
The economic value of execution quality
How much does execution actually matter? The empirical answer is consistent across published institutional studies: top-decile execution outperforms median execution by 5–15 basis points per institutional order, depending on instrument liquidity, order size, and market conditions. Compounded across the trade frequency of an institutional manager, this translates into materially different live-trading PnL relative to backtest.
For a quantitative strategy with a paper alpha of 30 bps per trade and a turnover of 200% per year, a 5–15 bp execution gap is between one sixth and one half of the alpha on every trade, and the difference between top-decile and median execution is roughly 0.4–0.6 of additional Sharpe per year. For most strategies, this is the difference between a track record that compounds and a track record that does not. Execution is rarely cited in pitch decks as a source of alpha, and is virtually always cited in retrospective analyses of why strategies failed in production.
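One way to see how basis points per trade turn into Sharpe is a proportional back-of-envelope calculation. It assumes that volatility and the number of trades do not change with execution quality, so Sharpe scales with net alpha per trade. The gross Sharpe of 1.5 used below is a hypothetical input that the text does not state, and the output moves one-for-one with it.

```python
def alpha_fraction(paper_alpha_bps: float, execution_saving_bps: float) -> float:
    """Share of per-trade paper alpha represented by an execution saving."""
    if paper_alpha_bps <= 0:
        raise ValueError("paper alpha must be positive")
    return execution_saving_bps / paper_alpha_bps


def sharpe_uplift(
    gross_sharpe: float,  # hypothetical Sharpe before execution costs
    paper_alpha_bps: float,
    execution_saving_bps: float,
) -> float:
    """Sharpe gained from the saving, holding volatility and trade count fixed."""
    return gross_sharpe * alpha_fraction(paper_alpha_bps, execution_saving_bps)


if __name__ == "__main__":
    for saving_bps in (5.0, 15.0):
        print(
            saving_bps,
            round(alpha_fraction(30.0, saving_bps), 3),
            round(sharpe_uplift(1.5, 30.0, saving_bps), 3),
        )
```

Under these assumptions turnover drops out of the ratio, because it scales paper alpha and execution cost by the same factor; what matters is how large the saving is relative to the alpha per trade.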
The strategic implication for an institutional manager is that execution should be treated as a research domain rather than a procurement decision. The choice of broker, the configuration of execution algorithms, the monthly calibration of impact and slippage models, the post-trade audit of toxicity and markouts: these are alpha decisions, not infrastructure decisions. Firms that internalise this correctly build execution research teams alongside signal research teams. Firms that treat execution as a commodity hand a material share of their realised alpha to the brokers and venues they happen to default to.
Across our four live strategies, execution is calibrated per-instrument and per-venue with monthly recalibration of the impact function from realised fills. Backtest cost assumptions are tuned conservatively to over-state slippage relative to live experience, so that strategies surviving backtest validation tend to meet or exceed backtest expectations in live trading rather than fall short of them. Execution quality is one of the components of the Sharpe ratio our research team optimises, alongside signal quality, portfolio construction, and risk management.