This article is part of the Academic Alibaba series and is taken from the AAAI 2019 paper entitled “RobustSTL: A Robust Seasonal-Trend Decomposition Algorithm for Long Time Series” by Qingsong Wen, Jingkun Gao, Xiaomin Song, Liang Sun, Huan Xu and Shenghuo Zhu. The full paper can be read here.
In the field of time series analysis, decomposition is the process of analyzing data in a time series to extract phenomena like trends (long-term trajectory) and seasonality (short-term cycles) as well as noise and anomalies. Models that perform this task well are extremely useful, as they can be used for forecasting or anomaly detection.
Of the various seasonal-trend decomposition models that exist, each has its own strengths and weaknesses, which make it well suited to particular applications. For example, X-13-ARIMA-based models are popular in statistics and economics because they handle monthly and quarterly data well. So far, however, no proposed model has offered a suitable solution to the unique challenges of analyzing time series data in the Internet of Things (IoT) era. Specifically, existing decomposition methods cannot adapt to fluctuations and shifts in seasonality.
Now, Alibaba’s tech team has developed RobustSTL, a novel variant of the classic STL (Seasonal-Trend decomposition using Loess) designed to cope better with the volume and variance of data from IoT devices and to extract seasonality and trends more precisely.
The Problem of a Day’s Worth of Data
Alibaba’s researchers found that existing decomposition methods typically struggle with the long and fluctuating seasonality of data from IoT devices, as well as with detecting anomalies in this data. The first part of this may sound counterintuitive. Most people follow a daily rhythm with regard to the IoT devices they interact with, so how is it that models that cope with monthly and quarterly financial analyses struggle with the comparative day-to-day monotony of consumers and their electronics?
The key is the number of data points. Take the daily closing values of the Dow Jones Industrial Average index as an example: a monthly seasonality would correspond to 28–31 data points, a fluctuation of 3 (the difference between the upper and lower ends of the range), while a quarterly seasonality would correspond to 90–92 data points.
In the IoT context, the seasonality is normally just one day, but measurements are taken every minute. That means there are 1,440 data points per season (24 hours of 60 readings each), and a fluctuation of just one hour corresponds to 60 data points.
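The gap in scale can be checked with a few lines of arithmetic. The sketch below is illustrative only: the helper name `points_per_season` is hypothetical, and it assumes one reading per minute for the IoT case and one data point per day for the financial case, as described above.

```python
MINUTES_PER_DAY = 24 * 60


def points_per_season(season_minutes, sample_interval_minutes):
    """Number of samples in one seasonal cycle (hypothetical helper)."""
    return season_minutes // sample_interval_minutes


# IoT case: a one-day season sampled once per minute.
iot_season = points_per_season(MINUTES_PER_DAY, 1)   # 1440 points
# A one-hour shift of the daily pattern, expressed in samples.
one_hour_shift = points_per_season(60, 1)            # 60 points

# Daily-sampled case: a monthly season spans 28 to 31 points,
# so the season length itself varies by only 3 points.
monthly_spread = 31 - 28                             # 3 points

print(iot_season, one_hour_shift, monthly_spread)
```

Under these assumptions, a one-hour shift in the IoT series displaces the pattern by 20 times as many samples as the entire month-length variation in the daily series.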
Given that most methods are geared towards extracting seasonality that consists of tens of data points, extracting seasonality and identifying trends in the IoT context is a major challenge. Most methods also tend to misinterpret anomalies in IoT time series as noise: a spike during an idle period can still be much lower than the peak-period average, so it does not stand out by magnitude alone. As a result, these methods offer limited utility in anomaly analysis.
A Robust Solution
To address these challenges, the Alibaba team developed RobustSTL, which treats trend extraction as a regression problem solvable using the least absolute deviation (LAD) loss with sparse regularization. It also proposes a method called non-local seasonal filtering to extract seasonality, and uses an iterative approach to produce trend and seasonality estimates that are as accurate as possible.
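To make the seasonal step more concrete, here is a minimal sketch of the idea behind non-local seasonal filtering, written under several assumptions rather than taken from the authors' implementation. Each seasonal value is estimated as a weighted average of points in a small window around the same position in previous seasons, with bilateral-style weights that favor points close in time offset and close in value. The function name, parameter names, default values, and toy data are all hypothetical, and the input is assumed to be already detrended. The LAD trend-extraction step and the outer iteration are not shown.

```python
import math


def nonlocal_seasonal_filter(y, period, k_seasons=2, half_window=3,
                             sigma_d=2.0, sigma_i=1.0):
    """Illustrative non-local seasonal filter (not the authors' code).

    For each index t, average the values found within +/- half_window
    samples of t - k * period for k = 1..k_seasons, weighting each
    neighbour by its time offset and by how similar its value is to y[t].
    """
    n = len(y)
    out = list(y)  # points with no usable history keep their input value
    for t in range(n):
        num = 0.0
        den = 0.0
        for k in range(1, k_seasons + 1):
            center = t - k * period
            for j in range(-half_window, half_window + 1):
                tp = center + j
                if tp < 0 or tp >= n:
                    continue
                w_offset = math.exp(-(j * j) / (2 * sigma_d ** 2))
                w_value = math.exp(-((y[t] - y[tp]) ** 2) / (2 * sigma_i ** 2))
                w = w_offset * w_value
                num += w * y[tp]
                den += w
        if den > 0:
            out[t] = num / den
    return out


# Toy data (assumed): a pulse at position 3 of each 10-sample season,
# except in the last season, where it arrives 2 samples late.
period = 10
y = [0.0] * 40
for idx in (3, 13, 23):
    y[idx] = 5.0
y[35] = 5.0

seasonal = nonlocal_seasonal_filter(y, period)
naive = y[35 - period]  # fixed-lag estimate: look exactly one season back
print(round(seasonal[35], 3), naive)
```

In this toy series, a fixed-lag estimate looks exactly one period back and misses the shifted pulse, while the windowed, similarity-weighted average still finds it in the neighbouring samples of earlier seasons. This is the kind of seasonality shift that the article says existing methods handle poorly.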
The team tested RobustSTL on a synthetic dataset developed to mimic real-world scenarios with complex seasonal and trend shifts, anomalies and noise. RobustSTL outperformed all other models and approaches in the test.
Similarly, RobustSTL achieved state-of-the-art performance on real-world datasets.