Hardware acceleration for financial markets using FPGAs
Increased revenue can be achieved using hardware implementations of financial algorithms instead of more traditional software implementations.
The arms race in high frequency and algorithmic trading continues
Banks, brokers and funds have continued to invest in faster software, lower-latency networks and hardware located closer to trading venues. This includes using platforms such as FPGAs, GPUs, custom processors and microwave communications systems. These solutions have produced greater volumes of orders, trades and market data, requiring trading venues and market data providers to upgrade their systems continually to keep up with this growth.
The next evolutionary phase will see elements of the software solution replaced with hardware
Hardware allows mathematical functions to be implemented at a low level with a customised parallel architecture, removing the overhead of control logic, the operating system, interfaces and interrupts. The functionality is highly deterministic and repeatable, and in some cases delivers decisions orders of magnitude faster than software alternatives.
Hardware solutions have been proven in the telecoms industry and are now being adopted by leading-edge high frequency trading firms. These solutions are also being assessed by trading venues and market data vendors who do not want to be left behind in the arms race.
A hardware platform reduces latency over a software implementation through its parallel architecture
If an algorithm can be broken down into a pipeline of functions and/or parallel paths, it can be accelerated using a hardware solution instead of software. This is because multiple functional blocks are processed simultaneously on an optimised, bespoke parallel platform, without the constraints of the serial program counter and fixed processor architecture found in a software platform.
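The effect of pipelining can be illustrated with a small timing model. This is a minimal sketch, not a measurement: the function names and the per-stage timings are hypothetical, and it assumes each stage takes a fixed time and that a pipeline accepts a new message as soon as its slowest stage is free.

```python
from typing import Sequence


def serial_latency_ns(stage_ns: Sequence[int]) -> int:
    """Time for one message to pass through every stage in turn."""
    return sum(stage_ns)


def pipelined_interval_ns(stage_ns: Sequence[int]) -> int:
    """Time between completed messages once the pipeline is full.

    The slowest stage sets the rate at which results emerge.
    """
    return max(stage_ns)


def total_time_ns(stage_ns: Sequence[int], n_messages: int, pipelined: bool) -> int:
    """Total time to process n_messages under this simple model."""
    if n_messages <= 0:
        return 0
    if pipelined:
        # First message fills the pipeline, the rest emerge one per interval.
        return serial_latency_ns(stage_ns) + (n_messages - 1) * pipelined_interval_ns(stage_ns)
    # Serial processing: each message waits for the previous one to finish.
    return n_messages * serial_latency_ns(stage_ns)


# Hypothetical stage timings in nanoseconds: parse, pre-trade check, match.
STAGES_NS = [40, 20, 60]
```

In this model the time for a single message is the same in both cases; the gain comes from overlapping consecutive messages, so a burst of 1,000 messages takes 120,000 ns serially and 60,060 ns pipelined. Splitting a stage into parallel paths would reduce the per-message time as well.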
Hardware provides a more deterministic result than software, reducing functional risk and making testing easier
A well-designed implementation of an algorithm in hardware is deterministic, i.e. it is predictable and repeatable, with no random events within its processing paths. This makes it easier to test, because there is a finite number of operating states, and hence there is a lower risk of functional errors. A software solution, by contrast, is subject to variation introduced by the operating system and event-driven interrupts, which give a near-infinite number of paths through the program flow that cannot all be tested.
Hardware delivers a predictable and repeatable processing latency
The combination of the parallel architecture and the deterministic nature of the hardware implementation means that the processing time of the algorithm is predictable and lower than that of the equivalent software implementation.
An example of acceleration: Improving the trade matching latency
The trading venue could consider implementing hardware solutions in the following ways:
- Parsing incoming and generating outgoing order/trade messages and market data messages, using protocols such as the Financial Information eXchange protocol (FIX) and FIX Adapted for Streaming (FAST), could benefit from parallelisation and deterministic processing.
- Proprietary APIs can be simplified and accelerated by keeping them on the same FPGA device, avoiding external interfaces with high-latency data exchange and indeterminate functionality.
- Pre-trade volume, price or collateral checks could benefit from hardware prioritisation and automation.
- The Matching Engine can be accelerated with a parallel architecture to perform the order matching function, aggregation calculations for best bid and offer volumes, bait generation and the management of complex orders (e.g. icebergs).
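The functions listed above can be sketched as a small software reference model, of the kind often written to define behaviour before a hardware design is attempted. This is an illustrative sketch only, not a venue's actual logic. The FIX tags used (35 MsgType, 11 ClOrdID, 54 Side, 38 OrderQty, 44 Price) are standard, but every class and function name here is hypothetical, prices are assumed to be integer ticks, and matching is assumed to follow simple price-time priority on a single instrument.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

SOH = "\x01"  # standard FIX field delimiter


def parse_fix(message: str) -> Dict[int, str]:
    """Split a tag=value FIX message into a {tag: value} dictionary."""
    fields: Dict[int, str] = {}
    for part in message.split(SOH):
        if part:
            tag, _, value = part.partition("=")
            fields[int(tag)] = value
    return fields


@dataclass
class Order:
    order_id: str
    side: str   # FIX tag 54: "1" = buy, "2" = sell
    qty: int
    price: int  # assumption: price expressed in integer ticks


def order_from_fix(fields: Dict[int, str]) -> Order:
    """Build an Order from a New Order Single (35=D) message."""
    if fields.get(35) != "D":
        raise ValueError("only New Order Single (35=D) is handled in this sketch")
    return Order(fields[11], fields[54], int(fields[38]), int(fields[44]))


def pre_trade_ok(order: Order, max_qty: int, min_price: int, max_price: int) -> bool:
    """Hypothetical volume and price-band check applied before matching."""
    return 0 < order.qty <= max_qty and min_price <= order.price <= max_price


class Book:
    """Single-instrument order book with price-time priority."""

    def __init__(self) -> None:
        self.bids: List[Order] = []  # best (highest) price first
        self.asks: List[Order] = []  # best (lowest) price first

    def submit(self, order: Order) -> List[Tuple[str, str, int, int]]:
        """Match an order; return trades as (taker, maker, price, qty)."""
        trades = []
        is_buy = order.side == "1"
        opposite = self.asks if is_buy else self.bids
        while order.qty > 0 and opposite:
            best = opposite[0]
            crosses = order.price >= best.price if is_buy else order.price <= best.price
            if not crosses:
                break
            fill = min(order.qty, best.qty)
            trades.append((order.order_id, best.order_id, best.price, fill))
            order.qty -= fill
            best.qty -= fill
            if best.qty == 0:
                opposite.pop(0)
        if order.qty > 0:
            same = self.bids if is_buy else self.asks
            same.append(order)
            # Stable sort: orders at the same price keep their arrival order.
            same.sort(key=lambda o: -o.price if is_buy else o.price)
        return trades

    def best_bid_offer(self) -> Tuple[Optional[Tuple[int, int]], Optional[Tuple[int, int]]]:
        """Return ((bid price, total qty), (offer price, total qty)), or None per side."""
        def top(levels: List[Order]) -> Optional[Tuple[int, int]]:
            if not levels:
                return None
            price = levels[0].price
            return price, sum(o.qty for o in levels if o.price == price)
        return top(self.bids), top(self.asks)
```

Each function corresponds to a block that could, in principle, become its own pipeline stage in hardware: message parsing, the pre-trade check, matching and the best bid and offer aggregation. Iceberg orders and other complex order types are deliberately left out of this sketch.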
Opportunities to increase trade volumes and reduce latency exist for trading firms and market data vendors too
The example given above highlights the functions within a trading venue that can benefit from hardware acceleration. Other capital markets firms could benefit from hardware acceleration too.
Trading firms perform similar message processing functions to the trading venue and additionally require solutions to process market data rapidly to support complex decision algorithms. All of the trading functions could be reviewed to identify whether there are opportunities to accelerate performance.
The challenge faced by market data vendors is to collect, cleanse, enrich and disseminate the burgeoning array of market data that is generated by trading venues. All of these activities could be accelerated with hardware solutions.
PA can help clients achieve business advantage through hardware acceleration in a number of ways
To secure the benefits from trade volume growth, trading venues must analyse the expected volume growth and assess the bottlenecks in their current solutions.
A review of the financial system architecture will highlight where latency improvements can be made and identify bottlenecks. Latency can be removed by consolidating platforms to eliminate unnecessary interfaces and connections, by speeding up algorithms and by removing shared interfaces such as Ethernet/IP connections.
Benchmarking will identify where the most effective timing gains can be made: this might be through mathematical simulation of critical functions to assess how different architectures can improve the timing of key algorithms, for example, how a parallel implementation can speed things up.
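One simple starting point for this kind of estimate is Amdahl's law, which bounds the speed-up available when only part of an algorithm can be parallelised. The sketch below is illustrative; the function name is hypothetical, and the parallel fraction would have to come from profiling the real algorithm.

```python
def amdahl_speedup(parallel_fraction: float, n_units: int) -> float:
    """Upper bound on speed-up from Amdahl's law.

    parallel_fraction: share of the run time that can be parallelised (0 to 1).
    n_units: number of parallel processing units applied to that share.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if n_units < 1:
        raise ValueError("n_units must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n_units)
```

For example, if 90% of an algorithm's run time can be spread across 10 parallel units, the bound is roughly a 5.3x speed-up, because the remaining serial 10% dominates. This is why identifying the serial bottlenecks matters as much as adding parallel paths.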
Prototyping a key decision-making algorithm in FPGA will indicate the performance improvements that can be made over software.
To help you formulate and implement a strategy to deal with the increased order volume from hardware-accelerated solutions, please contact us now.