Summary: | Data streams have become ubiquitous over the last two decades; potentially unending streams of continuously arriving data occur in fields as diverse as medicine, finance, astronomy and computer networks. As the world changes, so the behaviour of these streams is expected to change. This thesis describes sequential methods for the timely detection of changes in data streams based on an adaptive forgetting factor framework. These change detection methods are first formulated for detecting a change in the mean of a univariate stream, and are later extended to the multivariate setting and to detecting a change in the variance. The key issues driving the research in this thesis are that streaming data change detectors must operate sequentially, use a fixed amount of memory and, after encountering a change, continue to monitor for successive changes. We call this challenging scenario "continuous monitoring" to distinguish it from the traditional setting, which generally monitors for only a single changepoint. Additionally, continuous monitoring demands that the performance of the algorithms depend only weakly on the setting of their control parameters. One of the main contributions of this thesis is the development of an efficient, fully sequential change detector for the mean of a univariate stream in the continuous monitoring context. It is competitive with the benchmark algorithms of the single-changepoint setting, yet it requires only a single control parameter, which is easy to set. The multivariate extension provides similarly competitive performance. These methods are applied to monitoring foreign exchange streams and computer network traffic.
|