Soft Sensor Modeling Method by Maximizing Output-Related Variable Characteristics Based on a Stacked Autoencoder and Maximal Information Coefficients

Bibliographic Details
Main Authors: Yanzhen Wang, Xuefeng Yan
Format: Article
Language: English
Published: Atlantis Press 2019-09-01
Series: International Journal of Computational Intelligence Systems
Online Access: https://www.atlantis-press.com/article/125917186/view
Description
Summary: Establishing a precise soft sensor model for an industrial process depends on selecting, from a large number of online measurement variables, those that affect the vital indicators, and on eliminating the effects of unrelated disturbance variables. How to compress redundant information while retaining the unique characteristic information contained in the selected variables merits in-depth research. This work proposes a novel soft sensor modeling method based on weighted maximal information coefficients (MICs) and a stacked autoencoder (SAE), hereinafter referred to as MICW-SAE. Before each network in the SAE is trained, the MICs between each input variable and the output variable are calculated and compared with a first threshold. Input variables with low MICs are then selected, and for each of them an average MIC index is calculated using the other input variables. If this index is higher than a second threshold, the MIC of that variable is set to 0. Finally, the weights of all input variables are determined in accordance with the scale and placed into the loss function for training. The Boston house-price and naphtha dry point temperature datasets are used to validate the prediction ability of the model. Results demonstrate that MICW-SAE can enhance the output-related features of the input variables. Moreover, redundant information that can also be represented by other input variables is identified and excluded.
ISSN:1875-6883
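
The weighting steps described in the Summary can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the MIC values have already been computed by some external tool, that the "average MIC index" is the mean MIC between a variable and the other inputs, that weights are proportional to the adjusted MICs and normalized to sum to 1, and that the weights enter a squared reconstruction error. The names `mic_weights`, `weighted_reconstruction_loss`, `low_threshold`, and `redundancy_threshold` are hypothetical.

```python
from typing import List, Sequence


def mic_weights(
    mic_to_output: Sequence[float],
    mic_between_inputs: Sequence[Sequence[float]],
    low_threshold: float,
    redundancy_threshold: float,
) -> List[float]:
    """Turn precomputed MIC values into per-variable loss weights.

    mic_to_output[j]         : MIC between input j and the output variable.
    mic_between_inputs[j][k] : MIC between input j and input k.
    Both thresholds are tuning parameters (hypothetical names).
    """
    n = len(mic_to_output)
    adjusted = list(mic_to_output)
    for j in range(n):
        if mic_to_output[j] >= low_threshold:
            continue  # output-related enough: keep its MIC unchanged
        # Assumed "average MIC index": mean MIC to every other input.
        others = [mic_between_inputs[j][k] for k in range(n) if k != j]
        avg_index = sum(others) / len(others) if others else 0.0
        if avg_index > redundancy_threshold:
            adjusted[j] = 0.0  # weakly related and covered by other inputs
    total = sum(adjusted)
    if total == 0:
        raise ValueError("every MIC was set to 0; thresholds are too strict")
    # Assumed scaling: weights proportional to adjusted MIC, summing to 1.
    return [m / total for m in adjusted]


def weighted_reconstruction_loss(
    x: Sequence[float], x_hat: Sequence[float], weights: Sequence[float]
) -> float:
    """Weighted squared reconstruction error for one sample (assumed form)."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, x_hat))


# Toy example with three input variables (made-up MIC values).
example_weights = mic_weights(
    mic_to_output=[0.8, 0.2, 0.1],
    mic_between_inputs=[
        [1.0, 0.3, 0.7],
        [0.3, 1.0, 0.2],
        [0.7, 0.2, 1.0],
    ],
    low_threshold=0.3,
    redundancy_threshold=0.4,
)
example_loss = weighted_reconstruction_loss(
    [1.0, 2.0, 3.0], [0.0, 2.0, 0.0], example_weights
)
```

In the toy example the third variable has a low MIC with the output and a high average MIC with the other inputs, so its weight becomes 0 and its reconstruction error no longer contributes to the loss. The second variable also has a low MIC with the output but is not well represented by the others, so it keeps a nonzero weight.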