Alibaba Tech

Oct 31, 2018


Calling Attention to Neural Summarization

Text summarization by Alibaba’s AI Tech

In academic fields, when a paper is written or an article gets published, a section that references related work is typically included. This serves to contextualize the author’s contribution, connect people to previous studies within a specific domain, and help readers deepen their understanding of the topic at hand. But to read all of the suggested literature would be an incredibly laborious undertaking.

This is where text summarization comes into play. In machine learning, automatic text summarization is the process of using software to produce a short, accurate, and coherent summary of a longer text that retains its major points and themes.

Fig. 1: A heterogeneous bibliography graph

However, conventional approaches to text summarization rely heavily on human-engineered features and are becoming increasingly inadequate. To find a more efficient method, the Alibaba team used a heterogeneous bibliography graph to capture the relationships between scientific publications within a scalable scholarly database.
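To make the idea of a heterogeneous graph concrete, here is a minimal sketch of one in Python. The article does not list the graph's node or edge types, so the types used here (paper, author, venue) and the relation names are assumptions chosen for illustration, and `HeteroGraph` is a hypothetical helper rather than anything from the team's code.

```python
from collections import defaultdict


class HeteroGraph:
    """Tiny heterogeneous graph: every node has a type, every edge a relation label."""

    def __init__(self):
        self.node_type = {}             # node id -> type label
        self.edges = defaultdict(list)  # node id -> [(relation, neighbor id)]

    def add_node(self, node_id, node_type):
        self.node_type[node_id] = node_type

    def add_edge(self, src, relation, dst):
        # Store both directions so neighbors can be found from either end.
        self.edges[src].append((relation, dst))
        self.edges[dst].append((relation + "_inv", src))

    def neighbors(self, node_id, node_type=None):
        """Return neighbor ids, optionally keeping only one node type."""
        return [dst for _, dst in self.edges[node_id]
                if node_type is None or self.node_type[dst] == node_type]


# Illustrative data only: node types and relations are assumptions.
g = HeteroGraph()
g.add_node("p1", "paper")
g.add_node("p2", "paper")
g.add_node("a1", "author")
g.add_node("v1", "venue")
g.add_edge("a1", "writes", "p1")
g.add_edge("a1", "writes", "p2")
g.add_edge("p1", "cites", "p2")
g.add_edge("p1", "published_in", "v1")

papers_linked_to_p1 = g.neighbors("p1", node_type="paper")
```

What makes the graph "heterogeneous" is simply that nodes and edges carry types, so two papers can be related through a citation, a shared author, or a shared venue.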

Alibaba’s Solution

Sequence-to-sequence (Seq2Seq) is a general-purpose encoder/decoder framework built on neural networks. It can be used for text summarization, but on its own it makes aligning a related work section with its source documents quite challenging. The Alibaba team built its neural, data-driven text summarizer on the Seq2Seq architecture. Using the heterogeneous bibliography graph, the team's study proposed a joint context-driven attention mechanism to measure contextual relevance within long pieces of text.
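As a rough illustration of how attention might draw on two kinds of context, the sketch below blends attention weights computed from the text with weights computed from graph-derived relevance scores. The article does not give the team's actual formula, so the linear blend, the `mix` parameter, and the function name `context_attention` are all assumptions made for this example.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into weights that sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def context_attention(decoder_state, sentence_vectors, graph_relevance, mix=0.5):
    """Hypothetical blend of textual and graph evidence, not the paper's formula.

    decoder_state    : current decoder hidden vector
    sentence_vectors : one encoder vector per source sentence
    graph_relevance  : one graph-derived score per source sentence (assumed given)
    mix              : 0.0 uses text only, 1.0 uses the graph only
    """
    text_weights = softmax([dot(decoder_state, s) for s in sentence_vectors])
    graph_weights = softmax(graph_relevance)
    weights = [(1 - mix) * t + mix * g
               for t, g in zip(text_weights, graph_weights)]
    dim = len(sentence_vectors[0])
    # Context vector: weighted average of the sentence vectors.
    context = [sum(w * s[i] for w, s in zip(weights, sentence_vectors))
               for i in range(dim)]
    return weights, context


state = [1.0, 0.0]
sentences = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relevance = [0.1, 2.0, 0.3]
weights, context = context_attention(state, sentences, relevance)
```

With `mix=0.0` the first sentence receives the most attention because it matches the decoder state best; with `mix=1.0` the second sentence wins because its graph score is highest.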

Fig. 2: Framework of the team’s seq2seq summarizer

The team’s main objective with the Seq2Seq summarizer was to maintain topic coherence between the related work section and its target documents, where both textual and graph-based contexts play a role in accurately characterizing the relationship between scientific publications.

As shown in Figure 2, the Seq2Seq summarizer is modeled with a hierarchical encoder and an attention-based decoder. The encoder consists of two major layers: a convolutional neural network (CNN) and a long short-term memory (LSTM) based recurrent neural network (RNN). The CNN processes word-level text to derive sentence-level meanings, which are then fed to the RNN to handle longer-range dependencies within larger units, such as a paragraph.
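The two-level encoder can be sketched in a few dozen lines of plain Python. This is a toy version under stated assumptions, not the team's implementation: the weights are random and untrained, biases are omitted, the sizes (`EMB`, `N_FILTERS`, `HIDDEN`, `WIDTH`) are arbitrary, and every sentence is assumed to have at least `WIDTH` words.

```python
import math
import random

random.seed(0)


def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def cnn_sentence_vector(word_vectors, filters, width):
    """Word level: slide each filter over windows of `width` words,
    apply ReLU, then max-pool over positions to get one sentence vector."""
    windows = [sum(word_vectors[i:i + width], [])  # concatenate the window's words
               for i in range(len(word_vectors) - width + 1)]
    return [max(max(0.0, sum(w * x for w, x in zip(f, win))) for win in windows)
            for f in filters]


def lstm_step(x, h, c, params):
    """One LSTM step over the concatenated [input, previous hidden] vector."""
    z = x + h
    i = [sigmoid(v) for v in matvec(params["Wi"], z)]    # input gate
    f = [sigmoid(v) for v in matvec(params["Wf"], z)]    # forget gate
    o = [sigmoid(v) for v in matvec(params["Wo"], z)]    # output gate
    g = [math.tanh(v) for v in matvec(params["Wg"], z)]  # candidate cell values
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def encode_document(sentences, filters, params, hidden_dim, width):
    """Sentence level: run the LSTM over the CNN sentence vectors."""
    h, c = [0.0] * hidden_dim, [0.0] * hidden_dim
    states = []
    for words in sentences:
        sent_vec = cnn_sentence_vector(words, filters, width)
        h, c = lstm_step(sent_vec, h, c, params)
        states.append(h)
    return states


# Arbitrary toy sizes (assumptions for illustration).
EMB, N_FILTERS, HIDDEN, WIDTH = 4, 3, 5, 2
filters = rand_matrix(N_FILTERS, WIDTH * EMB)
params = {name: rand_matrix(HIDDEN, N_FILTERS + HIDDEN)
          for name in ("Wi", "Wf", "Wo", "Wg")}
# Two "sentences" of 6 and 3 random word embeddings.
document = [rand_matrix(6, EMB), rand_matrix(3, EMB)]
states = encode_document(document, filters, params, HIDDEN, WIDTH)
```

The encoder produces one hidden state per sentence; an attention-based decoder would then weight these states at each output step.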


To assess its approach, the Alibaba team conducted experiments on a dataset of more than 370,000 papers, around 8,000 of which were used in the evaluation. As shown in Fig. 3, the experimental results confirmed the validity of the proposed attention mechanism and demonstrated that the team’s model consistently outperformed six baselines on summarization.

Fig. 3: Alibaba’s approach, here denoted as P.S+N+Rteg+EUD, is more effective than other benchmarks

By locating the most important textual information in the given source documents, the team found that it was possible to build a faithful, representative related-work review of those documents. This makes it considerably easier and, most importantly, faster for readers to consume large amounts of textual information.


This story is published in Noteworthy, where 10,000+ readers come every day to learn about the people & ideas shaping the products we love.
