Developing a Trading Platform that Processes 325,000 Transactions per Second

How Alibaba’s technology supports faster rollout while improving system integrity


Following the trend set by Alibaba in previous years, the 2017 Double 11 Global Shopping Festival shattered records once again. One of the new milestones was the peak transaction rate, which reached 325,000 transactions per second.

Though this figure is a commercial achievement for Alibaba, it also represents an operational and technical challenge. Reaching transaction volumes of this magnitude means meeting the needs of dozens of business units within the Alibaba group while maintaining the stability and integrity of over 7,000 system applications. The biggest challenge in this kind of distributed microservice architecture is to ascertain business requirements and their impact, and then to implement and launch them quickly across the full chain. This involves analyzing the requirements, designing technical solutions, coding and testing, and the launch itself, all of which adds up to a very complicated cross-team process.

The difficulty is mainly reflected in:

1) The lack of a full-scale management mechanism, and low collaboration efficiency;

2) High barriers to platform entry and the inability of new businesses to try new modes without risking failure;

3) Poor separation between businesses and the platform, which is unable to support the development of the businesses;

4) Lack of reusable business assets.

Pain point 1: Lack of a full-scale management mechanism, and low collaboration efficiency

· The description of requirements is often provided in one simple sentence. The details are generally filled in later through requirement description documents, emails, and requirements clarification sessions.

· The delivery of requirements is inefficient and requires repeated communication. After these requirements are clarified, they lack effective delivery vehicles in the later design process, and are passed down to developers with several inaccuracies, leading to repeated communication, clarification, and rework.

· Platform capabilities are often unclear, and the evaluation of technical solutions takes a long time. When technicians evaluate the platform changes needed to implement a requirement, the platform’s capabilities are hard to see. Meanwhile, business code and platform code are mixed together, which makes it difficult to assess questions such as which platform capabilities can be reused, and how many businesses or systems a change affects and in what way. Without business visualization, these questions can only be answered by going through the code repeatedly.

· Similar requirements are constructed repeatedly. Personnel changes or turnover make it hard to track changes in requirements. Whenever similar requirements are encountered, the analysis, design, and coding need to be performed again.

Pain point 2: High barriers to platform entry, and the inability of new businesses to try new modes without risking failure

Pain point 3: Poor separation between businesses and the platform, which is unable to support the development of the businesses

Pain point 4: Lack of reusable business assets

In response to the above pain points and problems, the Alibaba tech team designed a new trading system based on a Trade Modularization Framework, TMF 2.0. With a brand new architecture and enhanced planning and monitoring functionalities, TMF 2.0 is an altogether more powerful and more robust trading platform, supporting more rapid new business launches while ensuring system stability and integrity.

Conceptualizing the TMF 2.0 Platform

To support more accurate assessments and better ongoing monitoring, TMF 2.0 was designed with six key functionalities in mind:

1. Full-chain business visualization

Business analysts and developers can discuss requirements and analyze impact based on the visualized business diagram. The business rules you see are what are run on the system.

Since there are over 7,000 applications in Alibaba Group, full-scale business visualization is necessary for efficient analysis.

2. Demand structuring

After the business requirements have been analyzed, they’re further broken down based on capabilities under the relevant business architecture specification. Alibaba’s specifications are used not only for demand, but also for business processes and interfaces. This standardization reduces communication and time costs.

3. Business configuration

Once the business definition has been visualized, business rules can be configured easily. After change requests have been approved, they can be applied rapidly.

4. Business test integration

Because Alibaba’s e-commerce business is characterized by long links involving multiple products from multiple business units (BUs), any change in business requirements calls for upstream and downstream regression verification. If done manually, processes like regression verification and test data preparation take a great deal of time. Instead, TMF 2.0 provides an auto-regression testing function based on business data visualization to ensure sufficient test coverage for normal and abnormal scenarios.

5. Business monitoring

In routine business maintenance, we need to constantly monitor the business dashboard. We need not only the overall indicator data, but also the indicator data for each specific business.

6. Business-oriented troubleshooting

When a business fails, a snapshot of the issue is captured so that the business track can be restored quickly and the problem located.

The framework of the platform was designed with these functionalities in mind.

Building the TMF 2.0 Framework

The framework is built on three components:

· Plug-in architecture for platform segregation and business customization

· Unified business identity program

· Segregation of management domain and operation domain

Plug-in Architecture for Platform Segregation and Business Customization

There are three layers to this architecture — business specification, solution realization, and business customization. These are shown and described in more detail below.

Image for post

Business Specification

This layer provides the theoretical model. Definition and specification work, such as business process definitions, business extension interface definitions, and business entity model specifications, is carried out here once and then reused.

Solution Realization

Business Customization

Though this architecture is complex, it allows for clearly demarcated responsibilities between different layers while consciously isolating the entire code. During new business deployments, the team first focuses on reuse of underlying business solutions before formulating solutions for different markets, and finally differentiates different parts of the business by solution categories.
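The three layers can be pictured with a minimal sketch. It is written in Python for brevity and does not reproduce TMF 2.0's actual API, which the article does not show; every class name, the `"car"` business code, and the timeout values are hypothetical.

```python
from abc import ABC, abstractmethod


# Layer 1, business specification: declares the extension interface only.
class PaymentTimeoutExt(ABC):
    """Hypothetical extension point: how long an unpaid order stays open."""

    @abstractmethod
    def timeout_hours(self) -> int: ...


# Layer 2, solution realization: a reusable default for one market.
class DefaultMarketSolution(PaymentTimeoutExt):
    def timeout_hours(self) -> int:
        return 24


# Layer 3, business customization: lives in its own business package.
class CarBusiness(DefaultMarketSolution):
    def timeout_hours(self) -> int:
        return 72


class ExtensionRegistry:
    """Platform side: finds the plug-in registered for a business code."""

    def __init__(self, default: PaymentTimeoutExt) -> None:
        self._default = default
        self._plugins = {}

    def register(self, business_code: str, plugin: PaymentTimeoutExt) -> None:
        self._plugins[business_code] = plugin

    def resolve(self, business_code: str) -> PaymentTimeoutExt:
        # Fall back to the shared solution when no customization exists.
        return self._plugins.get(business_code, self._default)


registry = ExtensionRegistry(default=DefaultMarketSolution())
registry.register("car", CarBusiness())

car_timeout = registry.resolve("car").timeout_hours()
other_timeout = registry.resolve("books").timeout_hours()
```

In this sketch the platform code only ever calls `resolve`, so a business can change its own plug-in without touching platform code or other businesses.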

Unified Business Identity Program through the Full-chain

The business identity needs to be abstracted through three dimensions — people, goods, and fields. Fields include market type, verticals, and channel source. Business processes and rules can be related once a unique business identity is generated.

The Alibaba team adopted a UIL-based business identification program to create these unified business identities. The overall design is based on standard abstraction models, with customized syntax and unified management models. It effectively identifies 99% of products through four dimensions — sample model, buyer model, seller model, and category model.

Business configuration and deployment can be managed uniformly according to these dimensions once a business identity has been assigned. For this, core elements such as configuration isolation, hot deployment, configuration rollback, and configuration determinism must be implemented.
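As a rough illustration of how a unique business identity might be derived from such dimensions, the sketch below matches an order context against an ordered list of rules. The attribute names, rule list, and business codes are assumptions made for the example; the article does not describe the actual identification syntax.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class OrderContext:
    buyer_type: str  # "people" dimension (hypothetical attribute)
    category: str    # "goods" dimension
    market: str      # "fields" dimension: market type
    channel: str     # "fields" dimension: channel source


@dataclass(frozen=True)
class IdentityRule:
    business_code: str
    matches: Callable[[OrderContext], bool]


# Rules are checked in order, so more specific rules come first.
IDENTITY_RULES = [
    IdentityRule("tmall.car", lambda c: c.market == "tmall" and c.category == "car"),
    IdentityRule("tmall.general", lambda c: c.market == "tmall"),
]


def identify(ctx: OrderContext) -> Optional[str]:
    """Return the business code of the first matching rule, or None."""
    for rule in IDENTITY_RULES:
        if rule.matches(ctx):
            return rule.business_code
    return None


car_order = OrderContext("consumer", "car", "tmall", "app")
book_order = OrderContext("consumer", "books", "tmall", "web")
unknown_order = OrderContext("consumer", "books", "other", "web")
```

Once every request carries a code such as `"tmall.car"`, configuration, deployment, and monitoring can all be keyed on that one value.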

Segregation of Management Domain and Operation Domain

This is necessary because business logic cannot rely on dynamic runtime calculations. Instead, it must be defined and visualized during a static period. During this static period, decisions can also be made to resolve any rule conflicts that appear in business definitions. During runtime, the business rules and conflict decision strategies defined in the static period are then strictly followed in the operations domain.

The following figure shows the architecture used for separating the business and operations domains.

Image for post

The business domain defines the business life cycle, business identity, and business objects, which include business processes and business management. Once these operations are completed, configuration files are delivered to operations domain platforms, which automatically resolve them into commands for execution.
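A minimal sketch of that hand-off, assuming a JSON configuration file and a fixed table of commands on the operations side. The file layout, step names, and command functions are all hypothetical; the point is only that configuration is resolved and validated when it is loaded, not improvised at runtime.

```python
import json

# Hypothetical commands the operations domain knows how to execute.
COMMANDS = {
    "reserve_inventory": lambda order: order.update(inventory="reserved"),
    "create_payment": lambda order: order.update(payment="pending"),
}


def load_process(config_text: str):
    """Resolve a delivered config file into commands, failing early on errors."""
    config = json.loads(config_text)
    steps = []
    for name in config["steps"]:
        if name not in COMMANDS:
            raise KeyError(f"config references unknown step {name!r}")
        steps.append(COMMANDS[name])
    return config["business_code"], steps


def run(steps, order: dict) -> dict:
    """Execute the resolved commands strictly in the configured order."""
    for step in steps:
        step(order)
    return order


config_text = json.dumps(
    {"business_code": "tmall.car", "steps": ["reserve_inventory", "create_payment"]}
)
business_code, steps = load_process(config_text)
result = run(steps, {"id": 1})
```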

Defining business rules in the business domain is a complex process. The three core elements in this process are the business identity, business superposition, and conflict decisions. Business superposition refers to the identification of business rule conflicts in two dimensions, horizontal and vertical, as shown in the following figure.

Image for post

The horizontal dimension is also known as the product dimension. Horizontal considerations include products being used by multiple vertical businesses (or vertical businesses using multiple products) and ascertaining whether a product is valid based on the given business session. For example, the validity of an e-voucher depends on whether the user has put it into use.

The vertical dimension is also known as the industry dimension. Often a specific business object (such as a commodity) can help determine which industry it belongs to in a static period. Business rules of one industry are not automatically imposed on other industries. For example, the payment time-out period for different industries can be set to one day, but if Tmall Car changes its time-out period, this change would not impact other industries.
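The isolation in the time-out example can be sketched as platform defaults plus per-industry overrides. The one-day default comes from the example above; the dictionary layout, the industry keys, and the 72-hour value are illustrative assumptions.

```python
# Platform-wide default: a one-day payment time-out.
PLATFORM_DEFAULTS = {"payment_timeout_hours": 24}

# Each industry stores only its own overrides (hypothetical industry keys).
industry_overrides = {"tmall_car": {}, "apparel": {}}


def rules_for(industry: str) -> dict:
    """Merge the defaults with one industry's overrides; overrides win."""
    return {**PLATFORM_DEFAULTS, **industry_overrides.get(industry, {})}


def set_rule(industry: str, key: str, value) -> None:
    """Change a rule for a single industry without touching the others."""
    industry_overrides.setdefault(industry, {})[key] = value


set_rule("tmall_car", "payment_timeout_hours", 72)

car_rules = rules_for("tmall_car")
apparel_rules = rules_for("apparel")
```

Because the change is written only into the `tmall_car` overrides, `apparel` still resolves to the platform default.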

Determining the complexity of a business based on the quantity of rules involved is framed as a simple calculation:

Total business rules for one business session = one vertical business rule set + n horizontal business rule sets

Therefore, defining and managing businesses requires specific operations to determine cross-sections of vertical and horizontal businesses. This helps find the best solutions to conflicts that arise after these cross-sections have been determined.
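The superposition of one vertical rule set with n horizontal rule sets, together with explicit conflict decisions, might look like the sketch below. It assumes rules are simple key-value pairs and that a conflict is any key set by more than one source; the rule names and products are made up for the example.

```python
def find_conflicts(vertical: dict, horizontals: dict) -> dict:
    """Map each rule key set by more than one source to those sources."""
    sources = {}
    for key in vertical:
        sources.setdefault(key, []).append("vertical")
    for product, rules in horizontals.items():
        for key in rules:
            sources.setdefault(key, []).append(product)
    return {key: who for key, who in sources.items() if len(who) > 1}


def superpose(vertical: dict, horizontals: dict, decisions: dict) -> dict:
    """Build a session's rules: one vertical set plus n horizontal sets.

    `decisions` names the winning source for each conflicting key. It must be
    complete before runtime, so a missing or invalid decision is an error.
    """
    conflicts = find_conflicts(vertical, horizontals)
    for key, who in conflicts.items():
        if decisions.get(key) not in who:
            raise ValueError(f"no valid conflict decision for rule {key!r}")
    merged = {}
    for source, rules in {"vertical": vertical, **horizontals}.items():
        for key, value in rules.items():
            if key not in conflicts or decisions[key] == source:
                merged[key] = value
    return merged


vertical = {"payment_timeout_hours": 24, "allow_refund": True}
horizontals = {
    "e_voucher": {"allow_refund": False},
    "presale": {"deposit_percent": 20},
}
decisions = {"allow_refund": "e_voucher"}

conflicts = find_conflicts(vertical, horizontals)
session_rules = superpose(vertical, horizontals, decisions)
```

Raising an error for an undecided conflict mirrors the idea that conflicts are settled in the static period rather than guessed at runtime.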

The TMF 2.0 Key Concept Models

Image for post

During the business configuration pipeline, the team considers the domains covered by the relevant business, the functions and products available under those domains, and the business points which can be extended. This requires the support of the domain functionality model. The model allows targeted settings by exposing structured data on the capabilities and extension points within each domain in the platform. This data is presented through the key view template (shown in the lower part of the figure).

Once business configuration is complete, the configuration data is saved and delivered to the business operation pipeline.

The usage model outlined above greatly streamlines the planning and launching of new businesses.

Business definition is visual, manageable, and configurable across the entire trading platform. Visualization is provided for system capabilities, business processes and rules, and product superposition. Configuring business rules is simple and reliable, following the “What You See Is What You Get” (WYSIWYG) principle. All systems based on TMF 2.0 standards can immediately obtain business configurability without any need for additional development.

Additionally, a comprehensive configuration version management system is provided for quick implementation or rollback of business configurations. Multi-tenancy management enables complete isolation between different business systems through tenants, who are allowed their own data space and configuration push policies.
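A minimal in-memory sketch of versioned, tenant-isolated configuration with rollback follows. The `ConfigStore` class and its methods are hypothetical, and a real system would persist versions and push them to running applications, which this example does not attempt.

```python
class ConfigStore:
    """Versioned configuration, kept separately for each tenant."""

    def __init__(self) -> None:
        self._history = {}  # tenant name -> list of config versions

    def publish(self, tenant: str, config: dict) -> int:
        """Store a new version and return its 1-based version number."""
        versions = self._history.setdefault(tenant, [])
        versions.append(dict(config))  # copy so later edits cannot leak in
        return len(versions)

    def current(self, tenant: str) -> dict:
        """Return a copy of the tenant's newest configuration."""
        return dict(self._history[tenant][-1])

    def rollback(self, tenant: str, version: int) -> int:
        """Re-publish an earlier version as the newest one, keeping history."""
        versions = self._history[tenant]
        if not 1 <= version <= len(versions):
            raise ValueError(f"unknown version {version} for tenant {tenant!r}")
        return self.publish(tenant, versions[version - 1])


store = ConfigStore()
v1 = store.publish("tmall", {"payment_timeout_hours": 24})
v2 = store.publish("tmall", {"payment_timeout_hours": 48})
store.publish("taobao", {"payment_timeout_hours": 12})

v3 = store.rollback("tmall", v1)
```

Rolling back by appending a new version, rather than deleting history, keeps an audit trail of what was live at each point.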

Business dimension-oriented operation and maintenance protection

1) Fault monitoring by business dimensions

The TMF 2.0 business system makes cross-dimensional grouping and differential monitoring easy since it utilizes a unified business identity.
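To illustrate why a unified business identity simplifies this, the sketch below keys every metric on a business code, so that both per-business and overall figures come from the same data. The class, method names, and business codes are assumptions for the example, not part of TMF 2.0.

```python
from collections import Counter, defaultdict


class BusinessMetrics:
    """Counts order outcomes, grouped by business code."""

    def __init__(self) -> None:
        self._counts = defaultdict(Counter)

    def record(self, business_code: str, success: bool) -> None:
        self._counts[business_code]["ok" if success else "failed"] += 1

    def failure_rate(self, business_code: str) -> float:
        """Failure rate for one business; 0.0 if nothing was recorded."""
        counts = self._counts.get(business_code, Counter())
        total = counts["ok"] + counts["failed"]
        return counts["failed"] / total if total else 0.0

    def overall_failure_rate(self) -> float:
        """Failure rate across all businesses combined."""
        failed = sum(c["failed"] for c in self._counts.values())
        total = failed + sum(c["ok"] for c in self._counts.values())
        return failed / total if total else 0.0


metrics = BusinessMetrics()
metrics.record("tmall.car", True)
metrics.record("tmall.car", False)
metrics.record("tmall.general", True)
metrics.record("tmall.general", True)
```

Here a problem confined to `"tmall.car"` shows up as a 50% failure rate for that business even though the overall rate is only 25%.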

2) Cluster deployment by business dimensions

3) Stability guarantee by business dimensions

Results

Secondly, it has successfully decoupled the platform and the businesses operating on it. This benefits businesses because their customizations are stored only in the business package while the platform remains intact, making releases more flexible. In numerous cases, a release for a single business entity has not required regression testing of any other business entity.

Finally, it has enabled the establishment of business asset libraries. As of now, over 50 business asset libraries have been accumulated, allowing greater efficiency in copying, adjusting, and launching new businesses.

(Original article by Yu Zhenxin虞振昕)

Written by Alibaba Tech
