A next-generation trading system that delivers faster performance
I. Development of the Electronic Trading System
The rapid growth of the global financial industry during the 20th century placed increasing demands on the core technologies of asset trading. In the 1950s, buyers and sellers traded by negotiation, and ask prices were recorded manually on paper. As the types of securities diversified and trading volume rose, this inefficient and costly way of handling quotes gradually created a paperwork crisis in the 1960s and 1970s. The New York Stock Exchange (NYSE) had no choice but to suspend trading every Wednesday and cut hours on other trading days to limit its activity. With their unrivaled capability to process a huge number of transactions simultaneously, computers started to come into play. The shift to a paperless process, the electronic revolution, was a crucial turning point in global financial history. Transactions have since migrated to electronic trading platforms, which offer quicker and cheaper operations without time or geographical barriers.
Electronic trading systems have emerged worldwide, including State Street’s Currenex, HKEX’s INET, ICAP’s EBS Spot Ai and LIFFE’s LIFFE CONNECT. Since crypto assets exist only in electronic form, they are naturally associated with electronic trading platforms, but the requirements for crypto trading systems and traditional trading systems are slightly different. Overall, a crypto trading system should possess the following characteristics:
a. Low latency and high throughput
Latency and throughput are the key indicators of a trading system’s performance. Our prime objective when designing a trading system is to achieve low latency and high throughput.
In the context of trading, latency refers to the time between a trading system receiving a request and returning its response. The surge in high-frequency trading volume largely drives the market’s demand for low latency. For high-frequency traders to trade across crypto exchanges, the exchanges’ trading systems need low-latency trading engines that handle orders quickly and reflect market realities in the highly competitive crypto market.
Throughput is the number of requests or events that a trading system can process per second. Because throughput directly affects trading efficiency, crypto trading systems should be designed to withstand extreme scenarios and make full use of their processing units.
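To make the two definitions concrete, here is a minimal sketch that derives both figures from request timestamps. The function name `summarize` and the sample timestamps are hypothetical and chosen only for illustration; they are not measurements from any real system.

```python
from statistics import mean

def summarize(samples):
    """samples: list of (received_ms, responded_ms) timestamp pairs."""
    # Latency of one request: response time minus receive time.
    latencies = [responded - received for received, responded in samples]
    # Observation window: first request received to last response sent.
    window_ms = max(r for _, r in samples) - min(r for r, _ in samples)
    return {
        "avg_latency_ms": mean(latencies),
        "throughput_per_s": len(samples) / (window_ms / 1000),
    }

# Three hypothetical requests handled inside a 500 ms window.
stats = summarize([(0, 100), (100, 300), (200, 500)])
print(stats)
```

The two numbers move independently: a system can answer each request quickly yet process few of them per second, which is why both are tracked.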
b. Maintainability and scalability
Compared to traditional assets, crypto prices are more volatile and more vulnerable to global shocks. Because crypto trading systems handle requests 24/7, they must be designed to need as little offline maintenance as possible. In addition, the crypto sector is changing rapidly: derivatives services as varied as margin, futures and options trading have all been rolled out within about a decade of its rise. This proliferation of innovative services has raised the requirements for the maintainability and scalability of crypto trading systems.
II. OKEx Lightning System 2.0: Lightspeed Performance
As one of the top global digital asset exchanges, OKEx serves tens of thousands of users with its comprehensive crypto assets and derivatives products, with an average daily trading volume of billions of USD. As an industry leader, we set extremely high standards for our trading systems. Following the upgrade of our trading system in August 2018 and multiple further upgrades, we have implemented our next-generation Lightning 2.0 system with world-leading performance. The key features of the Lightning 2.0 upgrade are as follows:
In the early days of crypto trading systems, platforms usually matched orders inside the database: the system looked up counterparty orders for each incoming order until that order was filled or expired, then calculated the traded amount and generated a transaction entry. This method ensured data consistency, but its long processing time meant it could not handle many market requests at the same time.
Our next-generation trading system, Lightning 2.0, has adopted in-memory matching: the order matching engine keeps order data in memory during auto-matching and accesses the database less frequently during trading. All matching outcomes and intermediate data are also stored in memory, which reduces the number of input/output operations involved and significantly boosts order matching speed.
Although keeping data in memory can greatly reduce trading latency, a crypto trading system risks losing that data if the power supply is interrupted. To solve this issue, we take the event sourcing approach, which persists the state of a business entity in an event-centric way. A traditional trading system stores the current state in the database; with event sourcing, the system stores the events that describe each state change, which enables it to rebuild the state. The system periodically takes snapshots of the state and, when rebuilding is required, replays the events recorded after the latest snapshot.
Moreover, modern central processing units (CPUs) access main memory far more slowly than they access their own caches. According to one test, retrieving data from a CPU’s L2 cache takes only about 1/7 of the time needed to retrieve it from main memory. To further reduce latency, it is important to make good use of the CPU cache. The unit of data transfer is the cache line, which is usually 64 bytes. When the CPU loads data from memory, it also brings the adjacent data in the same 64-byte line into the cache. Accordingly, we have made the following improvements to our Lightning system by controlling the layout of in-memory data:
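The idea of matching entirely in memory can be illustrated with a toy order book. This is a minimal sketch, not the Lightning 2.0 engine: the `OrderBook` class, its price-time priority rule, and the use of integer price ticks are all assumptions made for the example.

```python
import heapq
from itertools import count

class OrderBook:
    """Toy limit order book held entirely in process memory (illustrative only)."""

    def __init__(self):
        self._bids = []      # heap keyed on negated price, so the best bid is on top
        self._asks = []      # heap keyed on price, so the best ask is on top
        self._seq = count()  # arrival number breaks ties at the same price

    def submit(self, side, price, qty):
        """Match an incoming limit order; return a list of (price, qty) fills."""
        buying = side == "buy"
        resting, opposite = (self._bids, self._asks) if buying else (self._asks, self._bids)
        fills = []
        while qty > 0 and opposite:
            best = opposite[0]  # [heap_key, seq, price, remaining_qty]
            best_price = best[2]
            # Stop as soon as the best opposite price no longer crosses our limit.
            if (best_price > price) if buying else (best_price < price):
                break
            traded = min(qty, best[3])
            fills.append((best_price, traded))
            qty -= traded
            best[3] -= traded
            if best[3] == 0:
                heapq.heappop(opposite)
        if qty > 0:
            # Any unfilled remainder rests in the book; no database round trip.
            key = -price if buying else price
            heapq.heappush(resting, [key, next(self._seq), price, qty])
        return fills

book = OrderBook()
book.submit("sell", 101, 5)
book.submit("sell", 100, 3)
fills = book.submit("buy", 101, 6)
print(fills)  # [(100, 3), (101, 3)]
```

Every step here touches only Python objects in memory, which is the property the paragraph above describes; durability has to be provided separately.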
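A minimal sketch of event sourcing with periodic snapshots follows. The `Account` and `EventStore` classes, the event shapes, and the snapshot interval are hypothetical; a real system would write the event log and snapshots to durable storage rather than keep them in Python lists.

```python
import copy

class Account:
    """State is derived from events; only events and snapshots are persisted."""

    def __init__(self):
        self.balance = 0
        self.version = 0  # number of events applied so far

    def apply(self, event):
        kind, amount = event
        if kind == "deposit":
            self.balance += amount
        elif kind == "withdraw":
            self.balance -= amount
        self.version += 1

class EventStore:
    def __init__(self, snapshot_every=3):
        self.events = []      # stand-in for a durable append-only log
        self.snapshot = None  # latest saved copy of the state
        self.snapshot_every = snapshot_every

    def record(self, account, event):
        self.events.append(event)  # log the event first, then change state
        account.apply(event)
        if account.version % self.snapshot_every == 0:
            self.snapshot = copy.deepcopy(account)

    def rebuild(self):
        """Start from the latest snapshot and replay only the later events."""
        account = copy.deepcopy(self.snapshot) if self.snapshot else Account()
        for event in self.events[account.version:]:
            account.apply(event)
        return account

store, live = EventStore(), Account()
for event in [("deposit", 100), ("withdraw", 30), ("deposit", 5), ("withdraw", 20)]:
    store.record(live, event)

recovered = store.rebuild()  # simulates recovery after in-memory state is lost
print(recovered.balance)  # 55
```

The snapshot bounds recovery time: only the events recorded after it need to be replayed, rather than the full history.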
- Pack together pieces of data that are processed consecutively. Once the data are adjacent in memory, reading several pieces requires only the first load from memory into the cache; subsequent reads hit the cache, which improves system performance.
- Put frequently modified data (such as counters) on different cache lines. When multiple CPUs modify different bytes in a single cache line at the same time, false sharing occurs. For instance, after CPU1 modifies its own data, CPU2 must reload the entire cache line before it reads its own data again, because the line has been updated. As a result, the CPUs keep waiting for each other. That is why we pad the data so that each piece sits in a different cache line.
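The two layout rules above can be sketched with `ctypes` structures, which expose C-style field offsets. Python is used here only to show the layout; a latency-sensitive engine would apply the same idea in a systems language. The structure and field names are hypothetical, and 64 bytes is the typical cache line size mentioned above, not a guarantee for every CPU.

```python
import ctypes

CACHE_LINE = 64  # bytes; assumed typical size

class PackedQuote(ctypes.Structure):
    """Fields that are read together sit side by side within one 64-byte line."""
    _fields_ = [
        ("price", ctypes.c_int64),
        ("quantity", ctypes.c_int64),
        ("order_id", ctypes.c_uint64),
        ("timestamp_ns", ctypes.c_uint64),
    ]

class PaddedCounters(ctypes.Structure):
    """Two frequently updated counters, each padded out to its own cache line."""
    _fields_ = [
        ("orders_in", ctypes.c_uint64),
        ("_pad0", ctypes.c_char * (CACHE_LINE - 8)),
        ("fills_out", ctypes.c_uint64),
        ("_pad1", ctypes.c_char * (CACHE_LINE - 8)),
    ]

print(ctypes.sizeof(PackedQuote))       # 32
print(PaddedCounters.fills_out.offset)  # 64
print(ctypes.sizeof(PaddedCounters))    # 128
```

The padding only separates the counters onto distinct lines if the structure itself starts on a 64-byte boundary, which `ctypes` does not guarantee; languages such as C, C++ and Rust provide alignment attributes for that purpose.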
2. Publish–subscribe model and binary protocol
The two main messaging models are the request-response model and the publish-subscribe model.
In the publish-subscribe model, a queue is used for messaging. When a service needs to request other services, the information on the request is encapsulated into a message and placed onto the queue. Other services will subscribe to the message queue to obtain the information and process the request.
In the request-response model, the client and the server are strongly coupled and must be available at the same time. The client can only wait until the server finishes processing the request, which lowers its processing speed. In the publish-subscribe model, however, the publisher’s work is complete once it places the message onto the queue, so the publisher is decoupled from the subscriber. In addition, if the subscriber’s service is interrupted, the message persists on the queue and processing continues when the service resumes, without the publisher having to resend the message, which enhances the reliability of system communication. Therefore, this pattern is adopted in almost all scenarios to improve our Lightning 2.0 system’s availability and throughput.
After selecting the publish-subscribe pattern, the next step is to choose a suitable information exchange format. The essence of communication is the exchange of messages, which usually carry data. Exchange formats differ in transmission speed, in how easily the protocol can evolve, and in which programming languages they support. The choice of format is therefore a key consideration in designing a trading system.
The shortcomings of a text-based communication protocol are obvious: parsing is error-prone, and large text messages consume bandwidth, which does not suit trading systems that are extremely sensitive to efficiency and performance. A binary protocol, by contrast, is easy to parse and therefore delivers better performance. For this reason, we have adopted a binary protocol in our Lightning 2.0 system.
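The decoupling described above can be sketched with an in-process queue. The `MessageQueue` class and the message fields are hypothetical stand-ins for a real message broker, which would also persist messages to disk.

```python
from collections import deque

class MessageQueue:
    """In-process stand-in for a message broker queue (illustrative only)."""

    def __init__(self):
        self._pending = deque()

    def publish(self, message):
        # The publisher's work ends here; it never waits for a subscriber.
        self._pending.append(message)

    def drain(self, handler):
        """Deliver queued messages in arrival order; return how many were handled."""
        handled = 0
        while self._pending:
            handler(self._pending.popleft())
            handled += 1
        return handled

queue = MessageQueue()
queue.publish({"type": "new_order", "order_id": 1})
queue.publish({"type": "cancel_order", "order_id": 1})

# The subscriber was offline while both messages were published.
# When it comes back, it picks up where the queue left off.
received = []
delivered = queue.drain(received.append)
print(delivered)  # 2
```

Neither `publish` call depended on the subscriber being available, and nothing had to be resent once it resumed.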
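As a rough illustration of the difference, the sketch below encodes the same order as a fixed-width binary record and as JSON text. The field layout (`order_id`, `price_ticks`, `quantity`, `side`) is a hypothetical wire format invented for this example and is not the Lightning 2.0 protocol.

```python
import json
import struct

# Hypothetical layout, little-endian with no padding:
# order_id (unsigned 64-bit), price in ticks (signed 64-bit),
# quantity (unsigned 32-bit), side (unsigned 8-bit: 1 = buy, 2 = sell).
ORDER_FORMAT = struct.Struct("<QqIB")

def encode_order(order_id, price_ticks, quantity, side):
    return ORDER_FORMAT.pack(order_id, price_ticks, quantity, side)

def decode_order(payload):
    # Fixed offsets: no tokenizing, no string-to-number conversion.
    return ORDER_FORMAT.unpack(payload)

binary = encode_order(42, 1_000_050, 7, 1)
text = json.dumps(
    {"order_id": 42, "price_ticks": 1_000_050, "quantity": 7, "side": 1}
).encode()

print(len(binary))  # 21
print(len(binary) < len(text))  # True
print(decode_order(binary))  # (42, 1000050, 7, 1)
```

A fixed layout trades flexibility for speed: every field sits at a known offset, but adding a field requires versioning the format, which is part of the evolvability concern raised above.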
3. Horizontal scaling
To improve and expand the processing capability of a trading system, both horizontal scaling and vertical scaling are useful. Vertical scaling refers to upgrading a server, while horizontal scaling means adding servers. The hardware performance of a single server is limited by what current manufacturing can deliver. Once a server’s hardware configuration reaches that limit, it cannot be improved further, so horizontal scaling is the only option. However, horizontal scaling raises the problem of load balancing: how should the load of the entire system be distributed reasonably across different servers?
The first consideration is the data race. Although adding servers improves the system’s capability to process data in parallel, its processing capacity still cannot be improved effectively if the load is distributed poorly, because servers working in parallel may frequently race for the same data.
A trading system basically stores order, fund, and position data. To reduce data races, we partition those data into shards by user. Each user’s order, fund, and position data are processed independently of other users’, which helps avoid data races. What’s more, we further optimized our system by adding a round of batch processing for each shard to enhance its processing capacity. Margin data for derivatives trading pairs is a second target for sharding, because for a given user each trading pair is completely independent. In this way, we employ load sharding in two phases. When our system needs more servers, load rebalancing based on the shards gives it the flexibility to expand.
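A minimal sketch of the two sharding phases and per-shard batching follows. The function names, the modulo and CRC32 mappings, and the sample requests are assumptions for illustration; the source does not state which mapping or rebalancing scheme Lightning 2.0 uses.

```python
import zlib
from collections import defaultdict

def user_shard(user_id, shard_count):
    """Phase 1: all of one user's order, fund, and position data map to one shard."""
    return user_id % shard_count

def margin_shard(user_id, trading_pair, shard_count):
    """Phase 2: margin data is spread by (user, trading pair)."""
    key = f"{user_id}:{trading_pair}".encode()
    return zlib.crc32(key) % shard_count  # stable across processes, unlike hash()

def batch_by_shard(requests, shard_count):
    """Group (user_id, payload) requests so each shard handles its batch in one pass."""
    batches = defaultdict(list)
    for user_id, payload in requests:
        batches[user_shard(user_id, shard_count)].append((user_id, payload))
    return dict(batches)

requests = [(1, "buy"), (2, "sell"), (5, "cancel"), (6, "buy")]
batches = batch_by_shard(requests, shard_count=4)
print(batches)
```

Because no two shards ever hold the same user's data, servers working on different shards never contend for the same records. A plain modulo mapping moves most users when `shard_count` changes, so a production system would likely need a rebalancing-friendly scheme on top of this.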
4. System Scaling
A basic way to enhance the maintainability and scalability of a trading system is to separate its functionality. In this upgrade, we further split our system’s functionality into 3 modules, namely order matching, counter, and risk control. Each module contains its own internal data and status. Specifically, the order matching module is responsible for maintaining the order book and the counter module stores data on positions and account balances, while the risk control module performs the function of risk management.
As the modules work with each other to enable the functionality of the entire trading system, a mechanism is required for their communication. There are two options for inter-service communication: data sharing and messaging.
Data sharing is the most basic method: one module updates shared data, and another module obtains the new data by querying it. However, this approach has two significant disadvantages. First, if multiple modules modify and query the same data, data races usually result, and the response time of the database becomes far longer. Second, a module cannot learn of changes in other modules in real time; it only sees them after it queries.
As a result, our Lightning 2.0 system’s modules are designed to keep their own data and not to share data with each other. When a module’s internal state changes, the change is encapsulated into an event and placed onto the event loop. This reduces coupling and contention between system modules, and because other modules are notified as soon as the event is published rather than by polling, it greatly enhances our system’s communication speed.
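The sketch below shows modules that own their state and learn about changes only through events. The `EventBus`, `CounterModule`, and `RiskModule` classes and the shape of the `trade` event are hypothetical simplifications of the three-module split described above.

```python
from collections import defaultdict, deque

class EventBus:
    """Minimal event loop: events are queued, then dispatched to subscribers."""

    def __init__(self):
        self._handlers = defaultdict(list)
        self._queue = deque()

    def subscribe(self, event_type, handler):
        self._handlers[event_type].append(handler)

    def emit(self, event_type, data):
        self._queue.append((event_type, data))

    def run(self):
        while self._queue:
            event_type, data = self._queue.popleft()
            for handler in self._handlers[event_type]:
                handler(data)

class CounterModule:
    """Owns balances; no other module reads or writes them directly."""

    def __init__(self, bus):
        self.balances = defaultdict(int)
        bus.subscribe("trade", self.on_trade)

    def on_trade(self, trade):
        cost = trade["price"] * trade["qty"]
        self.balances[trade["buyer"]] -= cost
        self.balances[trade["seller"]] += cost

class RiskModule:
    """Keeps its own running total, built only from the events it receives."""

    def __init__(self, bus):
        self.traded_notional = 0
        bus.subscribe("trade", self.on_trade)

    def on_trade(self, trade):
        self.traded_notional += trade["price"] * trade["qty"]

bus = EventBus()
counter, risk = CounterModule(bus), RiskModule(bus)
# The matching module would emit this after filling an order.
bus.emit("trade", {"buyer": "alice", "seller": "bob", "price": 100, "qty": 2})
bus.run()
print(counter.balances["alice"], counter.balances["bob"], risk.traded_notional)  # -200 200 200
```

Each module reacts to the same event without querying the others, so there is no shared record for them to contend over.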
III. Lightning 2.0 Data Performance
We completed a comprehensive upgrade of our Lightning 2.0 system in the second half of 2019. How has its performance improved compared to Lightning 1.0?
Here are the latest statistics from our Hong Kong server testing in November:
In terms of order processing capacity, our system has a peak order processing capacity of 100,000 txn/s, comparable to mainstream trading systems in the global equity market.
Three indicators are used to test system latency: ACK latency, Live latency, and Cancel latency.
We used test data from September and November to compare the pre-upgrade and post-upgrade performance of our trading system (see below). The average ACK latency decreased from 50 ms to 25 ms, the average Live latency from 134 ms to 63 ms, and the average Cancel latency from 230 ms to 180 ms.
These results show that our Lightning 2.0 trading system has lower latency than its predecessor.
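For reference, the relative improvements implied by the quoted averages can be worked out as follows. The figures are the ones stated above; the helper name `reduction_pct` is hypothetical.

```python
# (before, after) average latency in milliseconds, as quoted in the text.
latencies_ms = {"ACK": (50, 25), "Live": (134, 63), "Cancel": (230, 180)}

def reduction_pct(before, after):
    """Percentage by which latency fell, rounded to one decimal place."""
    return round((before - after) / before * 100, 1)

reductions = {name: reduction_pct(b, a) for name, (b, a) in latencies_ms.items()}
print(reductions)  # {'ACK': 50.0, 'Live': 53.0, 'Cancel': 21.7}
```

In other words, ACK and Live latency roughly halved, while Cancel latency fell by about a fifth.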
IV. Industry Leader in Technology
The unlimited scalability, reproducibility, and flexibility of blockchain mean there are many more new assets waiting to be discovered. The ongoing development of blockchain technology will transform a growing range of intellectual property, copyright, and creative assets into crypto assets in the future. We expect the market and users to look for higher reliability and performance in trading systems.
As a world-leading cryptocurrency exchange with comprehensive C2C, spot, and derivatives trading services, we are constantly improving our trading products, risk management system, order matching engine, crypto assets storage service, and customer service. We have become the world’s largest crypto derivatives trading platform and enjoy great popularity with global users. Our ultimate goal is to grow with the blockchain and crypto sectors by committing extra resources to higher trading security and efficiency, and to push forward the development of the blockchain-driven world that everyone in the crypto space is dreaming of.
Disclaimer: This material should not be taken as the basis for making investment decisions, nor be construed as a recommendation to engage in investment transactions. Trading digital assets involves significant risk and can result in the loss of your invested capital. You should ensure that you fully understand the risk involved, take into consideration your level of experience and investment objectives, and seek independent financial advice if necessary.