This article is part of the Academic Alibaba series and is taken from the paper entitled “Accelerating E-Commerce Search Engine Ranking by Contextual Factor Selection” by Yusen Zhan, Qing Da, Fei Xiao, Anxiang Zeng, and Yang Yu. The full paper can be read here.
Every search engine (Google, Baidu, etc.) and e-commerce site (Taobao, Amazon, etc.) uses a different method to rank the results of its users’ requests, but one thing they all have in common is the massive amount of data involved in the ranking process. Taobao, for instance, processes billions of searches for hundreds of millions of users every single day.
It is essential for search results to be simultaneously accurate, reliable and highly efficient, with the results ranked and displayed as fast as possible and with minimal resource consumption.
This often proves problematic. For example, the quality and accuracy of a ranking can be improved by introducing technologies such as deep neural networks, but adding further, complex factors to an already encumbered process only results in higher resource consumption and increased system latency.
To address the challenges of large-scale traffic requests, a search engine may also downgrade the service level in terms of effectiveness, reducing the number of recalled items or taking services offline. While this reduces delays and unavailability, it also seriously impacts user experience.
Therefore, the Alibaba tech team sought to design a solution capable of ranking results effectively and efficiently while also providing a seamless user experience.
First, the team posited that search systems use an excessive number of factors in the ranking process, and that many of the resulting steps must therefore be redundant and consume valuable system resources. The team subsequently tested and confirmed this theory in real operating environments. It then concluded that, by carefully selecting a smaller set of more precise factors, the effectiveness and efficiency of the ranking process could be greatly improved.
This led the team to look into contextual factor selection (CFS): the problem of choosing, for each search, a subset of effective factors that gives the best balance between result quality and system latency. Selecting the optimal factors is a combinatorial optimization problem, which makes it a natural candidate for reinforcement learning.
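To make the trade-off concrete, here is a minimal Python sketch of factor selection as a search over subsets. It is not the method from the paper: the factor names, costs, quality gains, the additive scoring rule, and the `cost_weight` parameter are all hypothetical values chosen for illustration. It simply enumerates every subset for a given query context and keeps the one with the best quality-minus-cost score.

```python
from itertools import combinations

# Hypothetical ranking factors and their relative compute cost (made-up values).
FACTOR_COST = {"text_match": 1.0, "popularity": 1.0, "deep_model": 5.0}

# Hypothetical quality gain of each factor in each query context (made-up values).
FACTOR_GAIN = {
    "head_query": {"text_match": 0.6, "popularity": 0.3, "deep_model": 0.1},
    "tail_query": {"text_match": 0.3, "popularity": 0.05, "deep_model": 0.9},
}


def score(context, subset, cost_weight=0.1):
    """Quality minus weighted cost; the additive form is a simplifying assumption."""
    quality = sum(FACTOR_GAIN[context][f] for f in subset)
    cost = sum(FACTOR_COST[f] for f in subset)
    return quality - cost_weight * cost


def best_subset(context, cost_weight=0.1):
    """Exhaustively score all 2^n factor subsets and return the best one."""
    factors = sorted(FACTOR_COST)
    candidates = (
        subset
        for size in range(len(factors) + 1)
        for subset in combinations(factors, size)
    )
    return max(candidates, key=lambda s: score(context, s, cost_weight))


if __name__ == "__main__":
    for ctx in FACTOR_GAIN:
        print(ctx, best_subset(ctx))
```

With these toy numbers, the head query keeps only the two cheap factors, while the tail query drops `popularity` and pays for `deep_model`. Exhaustive search is only feasible here because there are three factors; the number of subsets doubles with every factor added, which is why a learned selection policy becomes attractive at production scale.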
As such, the tech team designed a CFS solution they are calling RankCFS, which applies a reinforcement learning framework to optimize search result rankings.
The RankCFS framework was compared in offline scenarios to other methodologies and was found to considerably reduce resource consumption, as it uses comparatively fewer factors. It was also tested in an online operating environment, namely Alibaba’s e-commerce site Taobao, during the 2017 Alibaba Singles’ Day Global Shopping Festival — the world’s largest shopping extravaganza.
The following figures compare system latency on a normal day and on Singles’ Day. The green lines indicate latency without RankCFS optimization, and the red lines indicate the latency when RankCFS is applied.
By combining RankCFS with Taobao’s CLOES algorithm, the system infrastructure handled 325,000 orders a second at its peak. Despite the massive surges in traffic, Alibaba was able to provide superior search performance and a seamless user experience during the festival, indicating that the CFS approach is capable of outperforming other existing solutions.