This article is part of the Academic Alibaba series and is taken from the paper entitled “An End-to-end Model of Predicting Diverse Ranking On Heterogeneous Feeds” by Zheng Gao, Zhuoren Jiang, Zizhe Gao, Heng Huang and Yuliang Yan, accepted by SIGIR 2018. The full paper can be read here.
When browsing an online shopping platform, we often search for a specific item, or even a type of item, and expect a wide range of accurate results with enough information to help us choose the right product. That is where the boom in multimedia content feeds comes into play. Users are shown a wide variety of heterogeneous information, from posts and item lists to images and videos, that gives detailed information about the product, such as maintenance guidelines and usage demonstrations.
Currently, two types of search engine are used in e-commerce: the item search engine (ISE), which returns a ranked list of items, and the content search engine (CSE), which recommends and ranks multimedia content. The two are used together so that both items and feed content can be ranked, increasing click-through rates and user attention.
However, this comes with two challenges: the diversity of feed types makes it difficult to rank heterogeneous feeds, and cross-domain knowledge between ISE and CSE needs further exploration to improve CSE ranking. Improving the quality of heterogeneous feed ranking in CSE can help users who rely on advice from feeds to inform their purchasing decisions.
Building on Previous Work
Although heterogeneous feed ranking is a new topic, previous studies on similar problems offer approaches that can be borrowed, including the HEG5 model for cross-domain challenges and novel cross-domain ranking methods. The multi-armed bandit (MAB) framework has also been used to alleviate these challenges, as have deep learning techniques for recommendation and information retrieval. Alibaba’s researchers built on these approaches by splitting the problem into two phases, Homogeneous Feed Ranking and Heterogeneous Type Sorting, focusing mainly on the latter while using an existing, well-known model, the Deep Structured Semantic Model (DSSM), to tackle the former.
Tackling Heterogeneous Type Sorting
To solve the heterogeneous type sorting task, two novel models were designed. An independent Multi-Armed Bandit (iMAB) model ranked feeds under the assumption that slots were independent of each other, and generated a single global model for feed ranking. A personalized Markov Deep Neural Network (pMDNN) model jointly selected feed types for all slots and offered a personalized feed type result for each user.
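The independence assumption behind a per-slot bandit can be illustrated with a small sketch. It is not the paper's iMAB implementation: the class name, the feed type labels, and the choice of a smoothed click-rate estimate (the posterior mean under a uniform Beta(1, 1) prior) are all assumptions made for illustration. Each slot keeps its own click statistics and picks its feed type without looking at any other slot, and the result is global rather than per-user.

```python
from collections import defaultdict

# Hypothetical feed type labels, chosen for illustration only.
FEED_TYPES = ["post", "item_list", "video"]


class IndependentSlotBandit:
    """One Bernoulli bandit per slot; slots share no information."""

    def __init__(self, num_slots, feed_types):
        self.num_slots = num_slots
        self.feed_types = list(feed_types)
        # Per-slot counters keyed by feed type, all starting at zero.
        self.clicks = [defaultdict(int) for _ in range(num_slots)]
        self.views = [defaultdict(int) for _ in range(num_slots)]

    def update(self, slot, feed_type, clicked):
        """Record one impression of `feed_type` in `slot`."""
        self.views[slot][feed_type] += 1
        self.clicks[slot][feed_type] += int(clicked)

    def estimate(self, slot, feed_type):
        """Smoothed click rate: posterior mean under a Beta(1, 1) prior."""
        clicks = self.clicks[slot][feed_type]
        views = self.views[slot][feed_type]
        return (clicks + 1) / (views + 2)

    def select_types(self):
        """Greedily pick the best-estimated feed type for every slot."""
        return [
            max(self.feed_types, key=lambda t: self.estimate(slot, t))
            for slot in range(self.num_slots)
        ]


# Toy click log of (slot, feed_type, clicked) records; the data is made up.
log = [
    (0, "video", True), (0, "video", True), (0, "video", True), (0, "video", False),
    (0, "post", True), (0, "post", False), (0, "post", False), (0, "post", False),
    (1, "post", True), (1, "post", True),
    (1, "video", False), (1, "video", False),
]

bandit = IndependentSlotBandit(num_slots=2, feed_types=FEED_TYPES)
for slot, feed_type, clicked in log:
    bandit.update(slot, feed_type, clicked)

chosen = bandit.select_types()
print(chosen)  # ['video', 'post']
```

A production bandit would normally add exploration (for example Thompson sampling or an upper confidence bound); the greedy choice here only keeps the example deterministic.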
The iMAB model relied solely on CSE records and generated the feed type for each slot independently using statistical estimation. The pMDNN model, on the other hand, integrated both ISE and CSE historical records to build user profiles, and used three-layer neural networks to generate the feed types for all slots at the same time. The feed types generated by both models were then used in a DSSM to predict relevant feeds for each slot.
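The last step, filling each slot with a relevant feed of the selected type, can be sketched as follows. This is a simplified stand-in rather than the paper's DSSM: a real DSSM learns the query and feed embeddings with neural networks and scores them by cosine similarity, whereas the embeddings here are made-up toy vectors, and the function name `fill_slots` and the feed identifiers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def fill_slots(slot_types, query_vec, feeds):
    """Pick, for each slot, the most relevant unused feed of the slot's type.

    `feeds` is a list of (feed_id, feed_type, embedding) tuples.
    A slot with no remaining candidate of its type is left as None.
    """
    used = set()
    page = []
    for feed_type in slot_types:
        candidates = [
            f for f in feeds if f[1] == feed_type and f[0] not in used
        ]
        if not candidates:
            page.append(None)
            continue
        best = max(candidates, key=lambda f: cosine(query_vec, f[2]))
        used.add(best[0])
        page.append(best[0])
    return page


# Toy query and feed embeddings; all values are invented for illustration.
query = [1.0, 0.0]
feeds = [
    ("v1", "video", [0.9, 0.1]),
    ("v2", "video", [0.1, 0.9]),
    ("p1", "post", [0.5, 0.5]),
    ("p2", "post", [1.0, 0.2]),
]

# Feed types per slot, as produced by a type sorting model.
page = fill_slots(["video", "post", "video"], query, feeds)
print(page)  # ['v1', 'p2', 'v2']
```

Separating type selection from within-type ranking mirrors the two phases described in the article: the type sorting model only decides what kind of feed goes where, and the relevance model only compares feeds of the same kind.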
A/B testing conducted on an Alibaba platform showed that, when using user preference and feed type dependency as metrics, the pMDNN model outperformed the iMAB model for heterogeneous feed ranking in the CSE. Specifically, the benefits of the pMDNN model include the integration of domain knowledge from both ISE and CSE to generate feed rankings in CSE, and the use of an end-to-end model to solve the challenge.