Meet the world’s largest E-commerce search engine: Taobao’s CLOES algorithm
This article is part of the Academic Alibaba series and is taken from the paper entitled “Cascade Ranking for Operational E-commerce Search” by Shichen Liu, Fei Xiao, Wenwu Ou and Luo Si, accepted by the 2017 Conference of the Association for Computing Machinery’s Special Interest Group on Knowledge Discovery and Data Mining. The full paper can be read here.
Taobao, a household name among E-commerce platforms in China, has a hugely versatile search engine that plays a crucial role in guiding shoppers to products. It must handle a huge number of user interactions in an incredibly short space of time, especially during big sales events like Alibaba’s annual Double 11 Global Shopping Festival (the 2017 event achieved total sales of $25 billion).
In June 2017, the Alibaba technical support team released, for the first time, their research on Double 11 search results. By using a proprietary ranking algorithm, the Cascade model in a Large-scale Operational Ecommerce Search (CLOES), the team cut CPU costs for the 2016 Double 11 by approximately 45%. At the same time, average search latency fell by roughly 30%, from 33 milliseconds to 24 milliseconds, and CLOES increased gross merchandise volume (GMV) by nearly 1%.
The search method uses a sequence of increasingly complex ranking models to progressively filter data items and refine the ranking order. The idea is to apply ranking models with limited features in the initial stages to eliminate the most irrelevant data items and, in later stages, to use more expansive ranking models on the remaining items to obtain accurate rankings. Keeping in mind the distinct service-oriented nature of ecommerce, the algorithm was also designed to restrict the engine’s latency and to rank results in a way that provides a good user experience.
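The cascade idea described above can be illustrated with a minimal Python sketch. This is not the CLOES implementation from the paper: the `Stage` class, the `cascade_rank` function, the two scoring functions, their weights, and the toy catalog are all hypothetical and chosen only to show how a cheap early stage prunes items before a costlier stage ranks the survivors.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence

Item = Dict[str, float]  # hypothetical feature dictionary for one product


@dataclass
class Stage:
    """One cascade stage: a scoring function and how many items it lets through."""
    name: str
    score: Callable[[Item], float]
    keep: int  # maximum number of items passed on to the next stage


def cascade_rank(items: Sequence[Item], stages: Sequence[Stage]) -> List[Item]:
    """Run items through each stage in order, dropping the tail after every stage."""
    survivors = list(items)
    for stage in stages:
        survivors.sort(key=stage.score, reverse=True)
        survivors = survivors[: stage.keep]
    return survivors


# Hypothetical scorers: the first reads a single cheap feature,
# the second combines several features with made-up weights.
def cheap_score(item: Item) -> float:
    return item["sales"]


def rich_score(item: Item) -> float:
    return 0.5 * item["sales"] + 2.0 * item["relevance"] + 1.0 * item["ctr"]


# Toy catalog with invented feature values.
catalog = [
    {"id": 1, "sales": 0.9, "relevance": 0.2, "ctr": 0.1},
    {"id": 2, "sales": 0.8, "relevance": 0.9, "ctr": 0.7},
    {"id": 3, "sales": 0.1, "relevance": 0.95, "ctr": 0.9},
    {"id": 4, "sales": 0.7, "relevance": 0.6, "ctr": 0.3},
]

stages = [
    Stage("cheap", cheap_score, keep=3),  # prune with the inexpensive model
    Stage("rich", rich_score, keep=2),    # rank survivors with the costly model
]

ranked = cascade_rank(catalog, stages)
print([item["id"] for item in ranked])
```

In this toy run the costlier scorer is evaluated on three items rather than four, which is the source of the compute savings at scale. The sketch also shows the trade-off: item 3 is dropped by the cheap stage even though the richer scorer would have rated it highly, so the choice of early-stage features and cut-offs matters.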
Experiments in both simulated and operational environments attest to the algorithm’s effectiveness in ranking accuracy and computing performance, especially when compared to conventional algorithms. The first major test came during the 2016 Double 11 Global Shopping Festival, for which the team introduced a large number of accurate but computationally expensive features (including reinforcement learning and deep learning features). The result was a 40% reduction in the sorting load on the engine and an overall improvement in sorting quality.
Given the site’s user and merchandise volume, number of guided transactions, transactions by click, search queries, and queries per second (QPS), Taobao’s search system is, without a doubt, the world’s largest ecommerce search engine. Naturally, such a search engine must be engineered to deal with a large number of user interactions, especially during high-traffic periods like the Double 11 Global Shopping Festival.
Most search engines are designed primarily to increase guided clicks. However, the ultimate aim of an ecommerce sorting function is to drive higher transaction and sales volumes, and to do so the Taobao search algorithms must address several practical issues.
The first concern is making the best use of limited datacenter resources when sorting search results. Secondly, a high-quality search experience must be delivered in the shortest possible processing time and with the most relevant selection of items. Lastly, the sorting function must ensure that multiple platform objectives are achieved, including clicks, quantities sold, and overall sales.