This article is part of the Academic Alibaba series and is taken from the paper entitled “Entire Space Multi-Task Model: An Effective Approach for Estimating Post-Click Conversion Rate” by Xiao Ma, Liqin Zhao, Guan Huang, Zhi Wang, Zelin Hu, Xiaoqiang Zhu, and Kun Gai, accepted by SIGIR 2018. The full paper can be read here.
Measuring the effectiveness of an advert is easier said than done. This is true even in the e-commerce industry, where data on users is much more easily gathered than in offline sales channels. With no way to measure or quantify the emotional resonance of an advert, there is always a degree of guesswork in measuring advertising success.
Advertisers and the platforms they work with typically look at data on the outcomes of the advertising process, from first encounter (the “impression”), through clicking on the ad, to making a purchase. From this data they develop metrics that measure how good an advert is at connecting users with products.
The Problem with CVR
Two of the most common metrics used for measuring ad success are click-through rate (CTR) and post-click conversion rate (CVR). Simply put, CTR measures the effectiveness of an ad, product recommendation, or similar in driving traffic to the site, while CVR measures its effectiveness at driving sales.
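As a rough illustration, both metrics can be computed from raw event counts. The function names and the example counts below are hypothetical, chosen only to make the two definitions concrete.

```python
def click_through_rate(impressions: int, clicks: int) -> float:
    """CTR: the share of impressions that led to a click."""
    return clicks / impressions if impressions else 0.0


def post_click_conversion_rate(clicks: int, conversions: int) -> float:
    """CVR: the share of clicks that led to a conversion (for example, a purchase)."""
    return conversions / clicks if clicks else 0.0


# Hypothetical counts for a single ad.
impressions, clicks, conversions = 10_000, 400, 20

ctr = click_through_rate(impressions, clicks)          # 400 / 10,000
cvr = post_click_conversion_rate(clicks, conversions)  # 20 / 400
print(f"CTR = {ctr:.2%}, CVR = {cvr:.2%}")
```

Note that the two rates have different denominators: CTR is measured over all impressions, while CVR is measured only over the impressions that were clicked.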
Of the two, CVR is proving increasingly crucial for advertisers. In optimized cost-per-click (OCPC) advertising, it is used to adjust the bid price per click and balance the interests of both the platform and the advertisers. In recommendation systems, it is used to balance users’ click preferences and purchase preferences.
Conventional CVR modeling borrows tried-and-tested CTR estimation techniques built on popular deep learning methods, and these already perform satisfactorily in industrial applications. Nevertheless, this approach has drawbacks, the main one being sample selection bias (SSB).
SSB arises because conventional CVR models are trained only on samples of clicked impressions, but are then used to make inferences over the entire space of all impressions.
Counting First Impressions with ESMM
The Alibaba tech team has proposed a new CVR model that considers the full sequential pattern of user actions: impression, then click, then conversion. Called the “entire space multi-task model” (ESMM), the new system eliminates the issue of SSB by redefining how CVR is estimated. It introduces two auxiliary tasks that predict the post-view CTR and the post-view click-through & conversion rate (CTCVR). Instead of training the CVR model directly on samples of clicked impressions, ESMM treats the predicted CVR (pCVR) as an intermediate variable which, multiplied by the predicted CTR (pCTR), equals the predicted CTCVR (pCTCVR). Crucially, both pCTCVR and pCTR are estimated over the entire inference space, which includes samples of all impressions.
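The relationship pCTCVR = pCTR × pCVR can be sketched in a few lines. This is a minimal illustration, not the team's implementation. It assumes a cross-entropy loss for both auxiliary tasks and replaces the two sub-networks with fixed, hypothetical probabilities. All function and variable names are made up for the example.

```python
import math


def binary_cross_entropy(label: int, prob: float, eps: float = 1e-12) -> float:
    """Log loss for one binary label and one predicted probability."""
    prob = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def esmm_loss(samples) -> float:
    """Sum the CTR and CTCVR losses over ALL impressions, clicked or not.

    Each sample is (clicked, converted, p_ctr, p_cvr). Here p_ctr and p_cvr
    stand in for the outputs of the two sub-networks (hypothetical values).
    """
    total = 0.0
    for clicked, converted, p_ctr, p_cvr in samples:
        p_ctcvr = p_ctr * p_cvr            # pCTCVR = pCTR * pCVR
        ctcvr_label = clicked * converted  # 1 only if clicked AND converted
        total += binary_cross_entropy(clicked, p_ctr)
        total += binary_cross_entropy(ctcvr_label, p_ctcvr)
    return total


# Hypothetical impressions: (clicked, converted, p_ctr, p_cvr)
impressions = [
    (0, 0, 0.05, 0.10),  # shown but not clicked: still contributes to both losses
    (1, 0, 0.60, 0.20),  # clicked, no purchase
    (1, 1, 0.70, 0.50),  # clicked and purchased
]
print(f"Loss over all impressions: {esmm_loss(impressions):.4f}")
```

In this sketch there is no loss term on pCVR alone. It is only ever supervised through the product, and both labels (click, and click followed by conversion) are defined for every impression.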
ESMM also employs transfer learning to combat data sparsity (DS), another common drawback of traditional CVR modeling: clicks are far rarer than impressions, so click-only training data is comparatively scarce. The feature representation parameters of the CVR network are shared with the CTR network, which is trained on all impressions. This lets the CVR network benefit from much richer samples than are typically available to it.
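The effect of sharing feature representations can be shown with a toy example. The sketch below is an assumption-heavy simplification: each network is reduced to a single logistic unit, the feature IDs and hyperparameters are hypothetical, and only the shared embedding table is updated. It shows that a CTR training step on an unclicked impression changes what the CVR side sees for the same features.

```python
import math
import random

random.seed(0)
EMBEDDING_DIM = 4
FEATURES = ("user_42", "item_7")  # hypothetical feature IDs

# A single embedding table, looked up by BOTH the CTR and the CVR tower.
shared_embeddings = {
    fid: [random.uniform(-0.1, 0.1) for _ in range(EMBEDDING_DIM)]
    for fid in FEATURES
}

# Each tower is reduced to one logistic unit with its own private weights.
ctr_weights = [random.uniform(-1, 1) for _ in range(len(FEATURES) * EMBEDDING_DIM)]
cvr_weights = [random.uniform(-1, 1) for _ in range(len(FEATURES) * EMBEDDING_DIM)]


def embed(feature_ids):
    """Concatenate the shared embeddings of the given feature IDs."""
    return [value for fid in feature_ids for value in shared_embeddings[fid]]


def predict(weights, feature_ids):
    """Logistic output of one tower on top of the shared embeddings."""
    logit = sum(w * x for w, x in zip(weights, embed(feature_ids)))
    return 1.0 / (1.0 + math.exp(-logit))


def ctr_step_on_embeddings(feature_ids, clicked, lr=0.5):
    """One gradient step of the CTR log loss, applied to the shared embeddings only."""
    error = predict(ctr_weights, feature_ids) - clicked  # d(loss) / d(logit)
    for slot, fid in enumerate(feature_ids):
        offset = slot * EMBEDDING_DIM
        for j in range(EMBEDDING_DIM):
            shared_embeddings[fid][j] -= lr * error * ctr_weights[offset + j]


p_ctr_before = predict(ctr_weights, FEATURES)
p_cvr_before = predict(cvr_weights, FEATURES)
ctr_step_on_embeddings(FEATURES, clicked=0)  # an UNCLICKED impression
p_ctr_after = predict(ctr_weights, FEATURES)
p_cvr_after = predict(cvr_weights, FEATURES)
print(f"pCTR: {p_ctr_before:.4f} -> {p_ctr_after:.4f}")
print(f"pCVR: {p_cvr_before:.4f} -> {p_cvr_after:.4f}")
```

A conventional CVR model would never see the unclicked impression at all. With a shared table, its embeddings are still shaped by that sample through the CTR task.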
Test Results: Nothing Short of Impressive
The ESMM system was tested using a dataset of 8.9 billion samples taken from the traffic logs of a product recommendation system on Taobao, Alibaba’s most well-known e-commerce platform in China.
When ranked against the baseline system and four competing approaches, ESMM consistently came out on top. In this setting an AUC gain of 0.1% over the baseline is considered significant, and in one set of tests ESMM achieved a gain of 2.32%. This suggests that ESMM has promise for wide industrial application in the future.