Fight Peak Data Traffic On 11.11: The Secrets of Alibaba Stream Computing

Alibaba’s Hangzhou Campus during the 11.11 Global Shopping Festival

Application scenarios of stream computing

The Department of Data Technology and Products acts as Alibaba's centralized data platform, and the batch and streaming data it generates serve multiple data application scenarios across the group. During the 2017 11.11 Shopping Festival (as in previous years), the real-time data for media displays, the real-time business intelligence data for merchants, and the various live studio products for executives and customer service representatives all originated from the Big Data Business Unit of Alibaba.

Data pipeline architecture in stream computing

Having weathered a flood of data traffic in recent years, the Alibaba stream computing team has gained extensive experience in engine selection, performance optimization, and the development of stream computing platforms. As shown below, they have established a stable and efficient data pipeline architecture.

Computing engine upgrade and optimization

In 2017, the Alibaba stream computing team fully upgraded its stream computing architecture, migrating from Storm to Blink. The new architecture more than doubled peak stream processing capability and increased average processing capability more than fivefold.

Optimized state management

Stream computation generates a huge amount of state. It was formerly stored in HBase, but is now stored in RocksDB; keeping state local eliminates network round trips and dramatically improves performance, meeting the requirements of fine-grained data computation. The number of keys can now reach hundreds of millions.
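The idea can be illustrated with a minimal Python sketch (not the Blink implementation; the class and key names are illustrative). A plain dict stands in for the embedded RocksDB instance: every keyed update is a local read-modify-write, with no per-record network call to a remote store such as HBase.

```python
class LocalKeyedState:
    """Keyed counters kept in an embedded local store.

    A dict stands in for RocksDB here; the point is that each
    update touches local storage only, with no network round trip
    to a remote store such as HBase.
    """

    def __init__(self):
        self.store = {}

    def increment(self, key, amount=1):
        # Local read-modify-write; no remote call per record.
        self.store[key] = self.store.get(key, 0) + amount
        return self.store[key]
```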

Optimized checkpoint (snapshot) and compaction (merge)

State grows steadily over time, so performing a full checkpoint every time would put extreme pressure on the network and disks. In data computation scenarios, optimizing the RocksDB configuration and using incremental checkpoints significantly reduces network transmission and disk reads and writes.
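The core of incremental checkpointing can be sketched as follows (a simplified Python model, not Blink's actual mechanism): only keys modified since the last checkpoint are snapshotted, and recovery replays the deltas in order.

```python
class IncrementalCheckpointer:
    """Snapshot only the keys changed since the last checkpoint.

    A simplified model: real engines ship RocksDB SST file deltas,
    but the principle is the same.
    """

    def __init__(self):
        self.state = {}
        self.dirty = set()       # keys modified since last checkpoint
        self.checkpoints = []    # ordered list of deltas

    def put(self, key, value):
        self.state[key] = value
        self.dirty.add(key)

    def checkpoint(self):
        # Persist only the delta, not the full (ever-growing) state.
        delta = {k: self.state[k] for k in self.dirty}
        self.checkpoints.append(delta)
        self.dirty.clear()
        return delta

    def restore(self):
        # Recovery replays deltas in order; later writes win.
        restored = {}
        for delta in self.checkpoints:
            restored.update(delta)
        return restored
```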

Asynchronous sink

By making the sink asynchronous, CPU utilization can be maximized, significantly improving TPS.
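A minimal sketch of the idea in Python (the class name and thread-pool size are illustrative, not Blink's API): sink writes are handed to a worker pool so the processing thread is never blocked on sink I/O, and a flush barrier waits for outstanding writes when needed.

```python
from concurrent.futures import ThreadPoolExecutor


class AsyncSink:
    """Hand each sink write to a worker pool so the main processing
    thread never blocks on sink I/O."""

    def __init__(self, write_fn, workers=4):
        self.write_fn = write_fn
        self.pool = ThreadPoolExecutor(max_workers=workers)
        self.pending = []

    def emit(self, record):
        # Returns immediately; the actual write happens on a worker.
        self.pending.append(self.pool.submit(self.write_fn, record))

    def flush(self):
        # Barrier: wait for all in-flight writes to complete.
        for future in self.pending:
            future.result()
        self.pending.clear()
```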

Abstracted public components

In addition to the engine-level optimizations, an aggregation component based on Blink was developed for the centralized data platform (currently, all stream tasks on the centralized data layer are implemented with this component). The component provides the functions commonly used in data computation and abstracts the topology and business logic into a JSON file, so development becomes a matter of setting parameters in that file. This makes development much easier and shortens the development cycle: a task that previously took 10 person-days now takes just 0.5 person-days, which benefits both product managers and developers. The unified component also improves task performance.
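A toy sketch of configuration-driven aggregation (the JSON schema and field names here are invented for illustration; the real component's format is not public): the grouping keys and aggregation operations live in a JSON document, and a generic runner interprets it, so adding a new metric means editing configuration rather than code.

```python
import json

# Hypothetical task configuration: group-by keys and aggregations
# are declared as data, not code.
CONFIG = json.loads("""
{
  "group_by": ["category"],
  "aggregations": [{"field": "amount", "op": "sum"}]
}
""")


def run_aggregation(config, records):
    """Generic runner that interprets the JSON config."""
    out = {}
    for r in records:
        key = tuple(r[k] for k in config["group_by"])
        slot = out.setdefault(
            key, {a["field"]: 0 for a in config["aggregations"]}
        )
        for a in config["aggregations"]:
            if a["op"] == "sum":
                slot[a["field"]] += r[a["field"]]
    return out
```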

1. Dimension merge

Many statistical tasks need to compute at both day-level and hour-level granularity. Previously, these were carried out as two separate tasks. The aggregation component now merges them and shares the intermediate state, reducing network transmission by over 50%, simplifying the computation logic, and saving CPU.
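One way to picture the shared-state trick (a Python sketch with illustrative names, not the component's code): maintain only the hour-level sums, and derive the day-level total from them, instead of running two tasks that each keep their own copy of the state.

```python
from collections import defaultdict


class MergedAggregator:
    """Maintain hour-level sums once; day totals are derived from
    the same intermediate state instead of a second task."""

    def __init__(self):
        self.hourly = defaultdict(float)  # (day, hour) -> running sum

    def add(self, day, hour, amount):
        self.hourly[(day, hour)] += amount

    def hour_total(self, day, hour):
        return self.hourly[(day, hour)]

    def day_total(self, day):
        # Day granularity reuses the hour-level state.
        return sum(v for (d, _), v in self.hourly.items() if d == day)
```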

2. Streamlined storage

For the key-value pairs stored in RocksDB, an index-based encoding mechanism was designed that cuts state storage by more than half. This optimization effectively reduces the pressure on the network, CPU, and disks.
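The general idea resembles dictionary encoding, sketched below in Python (the exact encoding Blink uses is not described in the text, so this is only an assumed illustration): each distinct string component of a key is assigned a small integer index once, and stored keys reference the index instead of repeating the full string.

```python
class KeyEncoder:
    """Dictionary-encode key components: store each distinct string
    once and reference it by a small integer index thereafter."""

    def __init__(self):
        self.index = {}    # string -> id
        self.reverse = []  # id -> string

    def encode(self, parts):
        ids = []
        for p in parts:
            if p not in self.index:
                self.index[p] = len(self.reverse)
                self.reverse.append(p)
            ids.append(self.index[p])
        return tuple(ids)

    def decode(self, ids):
        return tuple(self.reverse[i] for i in ids)
```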

3. High-performance sorting

Sorting is a very common requirement in streaming tasks. The Top component combines an in-memory PriorityQueue with Blink's new MapState feature (state management provided by the Blink framework) to dramatically reduce the number of serializations, improving performance tenfold.
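The priority-queue part of this can be shown with a short Python sketch using `heapq` (Python's heap, standing in for Java's PriorityQueue; the MapState integration is Blink-specific and omitted): a fixed-size min-heap keeps only the current top-n candidates, so most records are rejected with a single comparison and never serialized into state.

```python
import heapq


def top_n(stream, n):
    """Keep a fixed-size min-heap so only n items are retained;
    records below the current minimum are rejected with one compare."""
    heap = []  # min-heap of (score, key)
    for key, score in stream:
        if len(heap) < n:
            heapq.heappush(heap, (score, key))
        elif score > heap[0][0]:
            # Evict the smallest of the current top n.
            heapq.heapreplace(heap, (score, key))
    return sorted(heap, reverse=True)
```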

4. Batch transmission and write

When writing the final result tables to HBase and DataHub, writing every single record to the remote sink would greatly limit throughput. Through time-triggered or record-count-triggered mechanisms (a timer function), the components perform batch transmission and batch writes (mini-batch sink). The trigger thresholds can be configured flexibly based on actual business latency requirements, increasing task throughput and lowering the pressure on the service end.
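A minimal mini-batch sink along these lines might look like the following Python sketch (class and parameter names are illustrative): records accumulate in a buffer that is flushed when either a record-count threshold or a time threshold is reached, trading a bounded amount of latency for far fewer remote writes.

```python
import time


class MiniBatchSink:
    """Buffer records and flush as one batch when either a count
    threshold or a time threshold is hit (count-or-time trigger)."""

    def __init__(self, write_batch, max_records=100, max_delay_s=1.0):
        self.write_batch = write_batch
        self.max_records = max_records
        self.max_delay_s = max_delay_s
        self.buffer = []
        self.last_flush = time.monotonic()

    def emit(self, record):
        self.buffer.append(record)
        count_due = len(self.buffer) >= self.max_records
        time_due = time.monotonic() - self.last_flush >= self.max_delay_s
        if count_due or time_due:
            self.flush()

    def flush(self):
        if self.buffer:
            # One remote call for the whole batch.
            self.write_batch(list(self.buffer))
            self.buffer.clear()
        self.last_flush = time.monotonic()
```

Tuning `max_records` and `max_delay_s` is how latency requirements are traded against throughput, as the text describes.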

Data protection

High-priority applications (24/7 services) require cross-data-center disaster recovery, so that when one pipeline fails, traffic can be switched to another within seconds. The figure below shows the pipeline protection architecture of the entire real-time centralized data layer.

Real-time data pipeline during 11.11 Global Shopping Festival

Alibaba Tech

First-hand, detailed, and in-depth information about Alibaba's latest technology. Follow us:


