Summary: | Intelligent environment monitoring networks, a foundation of ecosystem research, have developed rapidly alongside the ever-growing Internet of Things (IoT). IoT-networked sensors deployed to monitor ecosystems generate copious data that are nonstationary and nonlinear, so outlier detection remains a concern. Most outlier detection models rely on hypothesis tests with preset outlier thresholds. Signal decomposition, by contrast, can describe both the stationary and the nonstationary relationships in sensor data. This paper therefore proposes a three-level hybrid model, called MF-EMD-CART-AR-EWMA, that combines the median filter (MF), empirical mode decomposition (EMD), classification and regression tree (CART), autoregression (AR) and exponentially weighted moving average (EWMA) methods to detect outliers in sensor data. The first-level performance is compared with that of the Butterworth, FIR, moving average, wavelet and Wiener filters. The second-level prediction performance is compared with that of support vector regression (SVR), K-nearest neighbor (KNN), CART, ensemble EMD with CART and AR (EEMD-CART-AR) and complementary ensemble EMD with CART and AR (CEEMD-CART-AR). Finally, the EWMA control chart is compared with the cumulative sum (CUSUM) and Shewhart control charts. The proposed hybrid model was evaluated on a real dataset from the hydrometeorological observation network in the Heihe River Basin. In these experiments it showed better generalization ability and higher accuracy than the compared models, and it was highly effective at detecting minor outliers in the predicted values. The approach offers a promising reference and a new perspective for outlier detection in sensor data.
|
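The first and third levels named in the summary can be illustrated with a minimal sketch. This is not the authors' implementation: the second level (EMD-CART-AR prediction) is left out, and the residuals are assumed to come from any predictor. The function names `median_filter` and `ewma_outliers`, the smoothing weight `lam=0.2` and the limit width `L=3.0` are illustrative assumptions (conventional control-chart defaults), not values taken from the paper.

```python
import math
from statistics import mean, median, pstdev


def median_filter(x, window=3):
    """Sliding-window median; edges are padded by repeating the end values."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    padded = [x[0]] * half + list(x) + [x[-1]] * half
    return [median(padded[i:i + window]) for i in range(len(x))]


def ewma_outliers(residuals, lam=0.2, L=3.0, mu=None, sigma=None):
    """Return indices where the EWMA of the residuals leaves its control limits.

    residuals: observed minus predicted values (the predictor is assumed given).
    mu, sigma: in-control mean and standard deviation; if omitted they are
    estimated from the residuals themselves (a simplifying assumption).
    """
    mu = mean(residuals) if mu is None else mu
    sigma = pstdev(residuals) if sigma is None else sigma
    z = mu  # the EWMA statistic starts at the in-control mean
    flagged = []
    for t, r in enumerate(residuals, start=1):
        z = lam * r + (1 - lam) * z
        # Time-varying limit half-width: L * sigma * sqrt(lam/(2-lam) * (1-(1-lam)^(2t)))
        width = L * sigma * math.sqrt(lam / (2 - lam) * (1 - (1 - lam) ** (2 * t)))
        if abs(z - mu) > width:
            flagged.append(t - 1)
    return flagged


if __name__ == "__main__":
    raw = [1, 1, 9, 1, 1]                    # hypothetical readings with one spike
    print(median_filter(raw, window=3))      # first level: denoised series
    residuals = [0, 0, 0, 0, 5, 5, 5]        # hypothetical prediction residuals
    print(ewma_outliers(residuals, mu=0.0, sigma=1.0))  # third level: flagged indices
```

Because the EWMA statistic accumulates evidence across successive residuals, a sustained shift keeps pushing it further past the limits, which is the usual reason for choosing it over a Shewhart chart when the deviations of interest are small.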