Delivering Results: Alibaba’s DeepETA Improves Prediction for Package Logistics

This article is part of the Academic Alibaba series and is taken from the AAAI 2019 paper entitled “DeepETA: A Spatial-temporal Sequential Neural Network Model for Estimating Time of Arrival in Package Delivery System” by Fan Wu and Lixia Wu. The full paper can be read here.

For e-commerce platforms, accurately predicting when online purchases will arrive is about more than serving customers. Reliable forecasting can help establish performance metrics for the couriers who make these deliveries while maximizing their efficiency, steadily reducing the time it takes to deliver each package. Still, despite advances in machine learning, sending multiple deliveries over variable routes involves complexities that have limited traditional approaches, challenging developers to look beyond simpler models for point-to-point scenarios like ride sharing.

Overview of the ETA prediction problem at one point in a sequence of deliveries (time t) and the subsequent point in the sequence (time t+1)

Now, researchers at Alibaba have advanced the DeepETA spatial-temporal sequential neural network to better account for factors such as the number of packages to be delivered, their delivery sequences, and the regularity of the delivery routes involved. As an end-to-end network that uses time-variant and -invariant route features to provide a probable delivery time, the model is already outperforming previous methods in experiments with real logistics data from urban China.

Tackling the Spatial-Temporal Problem: Solution Architecture

Overview of the three-module DeepETA framework

To process complex sequential information that can impact delivery times, the model’s Latest Route Encoder builds its Recurrent Neural Network (RNN) from LSTM (Long Short-Term Memory) cells, which overcome the exploding and vanishing gradient problems intrinsic to plain RNNs. It further incorporates a novel Spatial Encoder component that uses Geohash encoding to represent locations based on their inherent geographical attributes, with the additional benefit of reducing the volume of binary (0/1) input data the network has to process.
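To make the Geohash idea concrete, here is a minimal sketch of the standard public Geohash algorithm in Python. It is not taken from the paper: the function name geohash_encode and the default precision are illustrative assumptions, and the article does not say which precision or downstream embedding DeepETA uses.

```python
BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz"  # standard Geohash alphabet

def geohash_encode(lat, lon, precision=7):
    """Encode a latitude/longitude pair as a Geohash string (illustrative helper)."""
    lat_lo, lat_hi = -90.0, 90.0
    lon_lo, lon_hi = -180.0, 180.0
    chars = []
    bits, bit_count = 0, 0
    use_lon = True  # Geohash interleaves bits, starting with longitude
    while len(chars) < precision:
        if use_lon:
            mid = (lon_lo + lon_hi) / 2
            if lon >= mid:
                bits = (bits << 1) | 1
                lon_lo = mid
            else:
                bits = bits << 1
                lon_hi = mid
        else:
            mid = (lat_lo + lat_hi) / 2
            if lat >= mid:
                bits = (bits << 1) | 1
                lat_lo = mid
            else:
                bits = bits << 1
                lat_hi = mid
        use_lon = not use_lon
        bit_count += 1
        if bit_count == 5:  # every 5 bits become one base-32 character
            chars.append(BASE32[bits])
            bits, bit_count = 0, 0
    return "".join(chars)

print(geohash_encode(57.64911, 10.40744, 5))  # u4pru
```

Because each added character only subdivides the cell described by the previous characters, nearby locations tend to share a prefix, which is what lets the code carry geographical meaning rather than acting as an arbitrary ID.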

To capture recurring mobility patterns in historical delivery route data, the Frequent Pattern Encoder extracts spatial-temporal features for these routes and passes these to an attention-based layer. Using the latest route vector from the previous module, it then outputs a historical pattern vector reflecting data for both delivered and to-be-delivered packages on the delivery route.
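The article does not give the exact form of the attention layer, so the following is only a sketch under stated assumptions: it uses plain dot-product scoring with a softmax, treats the latest route vector as the query, and treats each historical route as an already-encoded vector. The function name attend and the toy vectors are hypothetical.

```python
import math

def attend(latest_route, history):
    """Weight historical route vectors by their similarity to the latest route.

    Assumes dot-product scoring; the paper's actual scoring function may differ.
    """
    # Similarity score between the query and each historical route vector
    scores = [sum(q * h for q, h in zip(latest_route, vec)) for vec in history]
    # Numerically stable softmax over the scores
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of the historical vectors gives the pattern vector
    dim = len(latest_route)
    pattern = [sum(w * vec[i] for w, vec in zip(weights, history)) for i in range(dim)]
    return weights, pattern

# Toy example: the first historical route resembles the latest route most closely
weights, pattern = attend([1.0, 0.0], [[0.9, 0.1], [0.0, 1.0], [0.5, 0.5]])
print(weights, pattern)
```

The design point this illustrates is that historical routes resembling the courier's current route contribute more to the historical pattern vector than dissimilar ones.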

Finally, DeepETA’s Jointly Training and Prediction module concatenates outputs from the Latest Route Encoder and the Frequent Pattern Encoder with time-invariant features of the delivery route, then feeds these to a fully connected layer to generate the final output. Because the mean square error (MSE) is sensitive to large values, outliers can dominate training, such as cases where a customer who is not home requires the courier to make a second trip. To reduce their impact, the module combines mean absolute percentage square error (MAPSE) and MSE as the loss function.
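The article gives neither the formula for the percentage-based term nor how the two terms are weighted, so the sketch below rests on assumptions: it uses a plain mean absolute percentage error for the percentage term and a hypothetical mixing weight alpha. It is meant only to show why adding a relative-error term dampens the influence of a single very late delivery.

```python
def combined_loss(y_true, y_pred, alpha=0.5, eps=1e-8):
    """Blend a percentage-error term with MSE (illustrative, not the paper's exact loss).

    alpha is a hypothetical mixing weight; eps guards against division by zero.
    """
    n = len(y_true)
    # MSE grows quadratically with the error, so outliers dominate it
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    # The percentage term grows only linearly and is scaled by the true value
    pct = sum(abs(t - p) / max(abs(t), eps) for t, p in zip(y_true, y_pred)) / n
    return alpha * pct + (1 - alpha) * mse

# Delivery times in minutes: true values vs. predictions
print(combined_loss([10.0, 20.0], [12.0, 18.0]))
```

With alpha set to 0 this reduces to plain MSE, and with alpha set to 1 it reduces to the percentage term alone, which makes the trade-off easy to explore.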

Experimental Results

The MAPE (mean absolute percentage error) score was used to indicate the relative deviation between the predicted ETA and the ground truth, while the RMSE (root mean square error) score was used as an absolute measure of each model’s error. As effective ETA prediction equates to more efficient delivery, the latter was reported as a time in minutes, with the lowest value denoting the best performance.
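For readers unfamiliar with the two metrics, the following sketch computes them using their standard definitions (mean absolute percentage error and root mean square error). The sample delivery times are made up for illustration and are not results from the paper.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error, in the same unit as the inputs (here, minutes)."""
    n = len(y_true)
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n)

def mape(y_true, y_pred):
    """Mean absolute percentage error, as a percentage; assumes no true value is zero."""
    n = len(y_true)
    return 100.0 * sum(abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)) / n

# Hypothetical delivery times in minutes: ground truth vs. predicted ETA
actual = [30.0, 60.0]
predicted = [33.0, 56.0]
print(round(rmse(actual, predicted), 2), round(mape(actual, predicted), 2))
```

RMSE answers "how many minutes off, on average, with large misses penalized more," while MAPE answers "how far off relative to the actual delivery time," which is why the two are reported together.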

DeepETA improved on the best-performing baseline method by 13.8% in RMSE and 16.5% in MAPE. Whereas traditional methods focus exclusively on modeling each route, DeepETA further improves on these capabilities through a sampling method in its attention layer that extracts high-level features to represent historical trajectories, as well as an LSTM-based layer that extracts sequential features from raw routes.


Alibaba Tech