Who Knows Whom: Connecting the Right People with Interactive Paths Embedding
This article is part of the Academic Alibaba series and is taken from the paper entitled “Interactive Paths Embedding for Semantic Proximity Search on Heterogeneous Graphs” by Zemin Liu, Vincent W. Zheng, Zhou Zhao, Zhao Li, Hongxia Yang, and Minghui Wu, accepted by KDD 2018. The full paper can be read here.
One of the most appealing features of web platforms is users’ ability to connect with others. On social media, for example, a user might not even have to actively search for friends before the platform recommends new connections. Beneath the surface of these web platforms lies a vast network of connections between users, and a large portion of this network relies on “semantic proximity search”: taking an object in the network as the query and ranking other objects according to semantic relations.
Semantic proximity search looks at features such as location, place of employment, and school to determine the semantic relationships implied through these connections. From there, the search takes the user as a query and asks which other users are likely to be neighbors, coworkers, or classmates, ranking them accordingly. These rankings are then used to power features such as recommended connections on social media, advisor/advisee connections on bibliography networks, and linking user identities on e-commerce platforms.
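To make the idea of "taking a user as a query and ranking others" concrete, here is a minimal sketch. It is an illustration only: the toy graph, the `type:value` naming of attribute nodes, and the `rank_candidates` helper are all assumptions made for this example, not anything taken from the paper.

```python
from collections import defaultdict

# Hypothetical toy heterogeneous graph: user nodes linked to typed attribute nodes.
EDGES = [
    ("alice", "school:ZJU"), ("bob", "school:ZJU"),
    ("alice", "employer:Acme"), ("carol", "employer:Acme"),
    ("alice", "location:Hangzhou"), ("bob", "location:Hangzhou"),
    ("dave", "location:Singapore"),
]

def build_attrs(edges):
    """Group attribute nodes by the user they are attached to."""
    attrs = defaultdict(set)
    for user, attr in edges:
        attrs[user].add(attr)
    return attrs

def rank_candidates(query, attrs, attr_type):
    """Rank all other users by how many attribute nodes of one type
    (e.g. "school" for a classmate relation) they share with the query."""
    wanted = {a for a in attrs[query] if a.startswith(attr_type + ":")}
    scores = [(user, len(wanted & user_attrs))
              for user, user_attrs in attrs.items() if user != query]
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores, key=lambda pair: (-pair[1], pair[0]))

attrs = build_attrs(EDGES)
classmates = rank_candidates("alice", attrs, "school")
coworkers = rank_candidates("alice", attrs, "employer")
```

Here `classmates` puts bob first and `coworkers` puts carol first. A real system would learn the scoring from data rather than counting shared attributes, which is where the approaches discussed next come in.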
Semantic proximity searches aren’t perfect, however. Semantic relationships on heterogeneous graphs aren’t always explicit, and there can be missing links between objects.
Prior research has tried to measure semantic proximity using the paths that connect the query object and the target object. However, these paths are only weakly coupled in the model: each path is processed individually, and the outputs are aggregated only in the final stage. This limits the model’s ability to form a complete picture of the interdependence between objects.
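The weakly coupled pattern described here can be sketched in a few lines. This is a simplified stand-in rather than any specific published baseline: the length-decay score, the `max_edges` cutoff, and all function names are assumptions chosen for illustration.

```python
from collections import defaultdict

def build_graph(edges):
    """Undirected adjacency sets from (node, node) pairs."""
    graph = defaultdict(set)
    for u, v in edges:
        graph[u].add(v)
        graph[v].add(u)
    return graph

def simple_paths(graph, src, dst, max_edges):
    """Yield every cycle-free path from src to dst with at most max_edges edges."""
    stack = [(src, [src])]
    while stack:
        node, path = stack.pop()
        if node == dst:
            yield path
            continue
        if len(path) - 1 >= max_edges:
            continue
        for nxt in sorted(graph[node]):
            if nxt not in path:
                stack.append((nxt, path + [nxt]))

def score_path(path, decay=0.5):
    """Score one path in isolation (assumed rule: shorter paths count more)."""
    return decay ** (len(path) - 1)

def weakly_coupled_proximity(graph, query, target, max_edges=4):
    """Each path is scored on its own; the scores only meet in the final sum."""
    return sum(score_path(p) for p in simple_paths(graph, query, target, max_edges))

graph = build_graph([
    ("alice", "school:ZJU"), ("bob", "school:ZJU"),
    ("alice", "location:Hangzhou"), ("bob", "location:Hangzhou"),
    ("alice", "employer:Acme"), ("carol", "employer:Acme"),
])
alice_bob = weakly_coupled_proximity(graph, "alice", "bob")
alice_carol = weakly_coupled_proximity(graph, "alice", "carol")
```

Note that `score_path` never sees any other path. Whatever two paths have in common, such as a shared intermediate node, is invisible until the final sum.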
Alibaba’s tech team, in collaboration with researchers from Zhejiang University and the Advanced Digital Sciences Center in Singapore, has developed Interactive Paths Embedding (IPE) to more strongly couple paths for semantic proximity search, finding connections between users that may go unnoticed by current baselines.
Alibaba’s team of researchers introduced the concept of interactive paths, processing multiple paths at once and adding dependencies between them. As a result, these paths are considered to be strongly coupled. These interactive paths are then embedded into a low-dimensional vector that can capture the full scope of the semantic relationships between users.
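One way to picture the dependencies between paths is to look for the nodes they have in common. The sketch below assumes that paths interact where they pass through the same intermediate node; the `interaction_points` helper and the example paths are hypothetical and only meant to illustrate the idea.

```python
from collections import defaultdict

def interaction_points(paths):
    """Map each intermediate node to the indices of the paths passing through it,
    keeping only nodes shared by at least two paths."""
    through = defaultdict(set)
    for index, path in enumerate(paths):
        for node in path[1:-1]:  # skip the query and target endpoints
            through[node].add(index)
    return {node: sorted(ids) for node, ids in through.items() if len(ids) > 1}

# Three hypothetical paths sampled between the same query and target users.
paths = [
    ["alice", "school:ZJU", "carol", "employer:Acme", "bob"],
    ["alice", "location:Hangzhou", "carol", "employer:Acme", "bob"],
    ["alice", "school:ZJU", "bob"],
]
shared = interaction_points(paths)
```

In this example `shared` links paths 0 and 2 through `school:ZJU`, and paths 0 and 1 through `carol` and `employer:Acme`. These are the places where a strongly coupled model could let one path inform another.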
From there, the researchers applied a cycle-free shuffling mechanism. Cycles in a graph structure are undesirable because they make it more difficult for two nodes to reach one another, so this mechanism shuffles the order of the paths to remove cycles and maximize path efficiency. Then, a Gated Recurrent Unit (GRU) architecture embeds the interactive paths, allowing each GRU to model its interdependencies with the other GRUs. Finally, the embedding output of the interactive-paths structure is aggregated into a single vector, which can then be used to estimate semantic proximity.
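The embedding and aggregation steps can be sketched roughly as follows. This is a toy under stated assumptions, not the paper's architecture: it uses one shared, untrained GRU cell with random weights, requires equal-length paths, models interaction by averaging the hidden states of paths that sit on the same node at the same step, and pools with a plain mean. It also leaves out the cycle-free shuffling step entirely.

```python
import math
import random

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(matrix, vec):
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]

def vec_add(a, b):
    return [x + y for x, y in zip(a, b)]

class GRUCell:
    """Standard GRU update with random, untrained weights (illustration only)."""
    def __init__(self, in_dim, hid_dim, rng):
        def mat(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
        self.Wz, self.Uz = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.Wr, self.Ur = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)
        self.Wh, self.Uh = mat(hid_dim, in_dim), mat(hid_dim, hid_dim)

    def step(self, x, h):
        z = [sigmoid(v) for v in vec_add(matvec(self.Wz, x), matvec(self.Uz, h))]  # update gate
        r = [sigmoid(v) for v in vec_add(matvec(self.Wr, x), matvec(self.Ur, h))]  # reset gate
        gated = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(v) for v in vec_add(matvec(self.Wh, x), matvec(self.Uh, gated))]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]

def embed_interactive_paths(paths, node_vecs, cell, hid_dim):
    """Run all paths in lockstep and return one pooled vector."""
    hidden = [[0.0] * hid_dim for _ in paths]
    for t in range(len(paths[0])):
        hidden = [cell.step(node_vecs[p[t]], h) for p, h in zip(paths, hidden)]
        # Assumed interaction rule: paths on the same node share their mean state.
        groups = {}
        for i, p in enumerate(paths):
            groups.setdefault(p[t], []).append(i)
        for members in groups.values():
            if len(members) > 1:
                mean = [sum(hidden[i][d] for i in members) / len(members)
                        for d in range(hid_dim)]
                for i in members:
                    hidden[i] = list(mean)
    # Aggregate the per-path states into a single vector (mean pooling).
    return [sum(h[d] for h in hidden) / len(hidden) for d in range(hid_dim)]

rng = random.Random(0)
paths = [
    ["alice", "school:ZJU", "carol", "employer:Acme", "bob"],
    ["alice", "location:Hangzhou", "carol", "employer:Acme", "bob"],
]
node_vecs = {n: [rng.uniform(-1, 1) for _ in range(3)] for p in paths for n in p}
vector = embed_interactive_paths(paths, node_vecs, GRUCell(3, 4, rng), hid_dim=4)
```

The resulting `vector` has four dimensions. In a trained model, a vector like this would feed a scoring function that estimates how close the query and target are for a given relation.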
Putting IPE into Practice
To test the effectiveness of IPE in the field, Alibaba’s tech team drew on several types of heterogeneous networks, including LinkedIn, Facebook, DBLP, and Taobao. In the experiment, IPE and several other semantic user search baselines were tasked with identifying different types of relationships based on sets of traits unique to each network. The researchers constructed an ideal ranking for each test query user and each desired semantic relation, then compared this ideal ranking against the rankings generated by IPE and by various state-of-the-art semantic user search algorithms.
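Comparing a generated ranking against an ideal one is usually done with a ranking metric. The article does not say which metric was used, so the sketch below assumes NDCG, one common choice; the user names and relevance labels are made up for illustration.

```python
import math

def dcg(gains):
    """Discounted cumulative gain: later positions contribute less."""
    return sum(gain / math.log2(position + 2) for position, gain in enumerate(gains))

def ndcg_at_k(ranked_ids, relevance, k):
    """DCG of the produced ranking divided by the DCG of the ideal ranking."""
    gains = [relevance.get(item, 0) for item in ranked_ids[:k]]
    ideal = sorted(relevance.values(), reverse=True)[:k]
    ideal_dcg = dcg(ideal)
    return dcg(gains) / ideal_dcg if ideal_dcg > 0 else 0.0

# Hypothetical ground truth for one query user and one relation (e.g. classmate).
relevance = {"bob": 1, "carol": 1}

perfect = ndcg_at_k(["bob", "carol", "dave"], relevance, k=3)
flawed = ndcg_at_k(["dave", "bob", "carol"], relevance, k=3)
```

A ranking that places every relevant user first scores 1.0, while pushing an irrelevant user to the top, as in `flawed`, lowers the score.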
In all of these tests, IPE outperformed not only the competing baselines but also degraded versions of itself, often by a significant margin. This validates the interactive-paths structure and opens the door to extending IPE to handle attributes and dynamics in heterogeneous networks for semantic proximity search.