Summary: | Software failures are a tangible and imminent problem in enterprise software systems. Failures can usually be detected by monitoring critical indicators against threshold values. However, by the time a threshold is crossed, there is often too little time left to prevent or mitigate the failure. Failures therefore need to be predicted in advance from application status logs. Several approaches to failure prediction have been studied; one of them is based on detecting anomalies that precede failures in application state data. The paper proposes several machine learning approaches to anomaly detection for failure prediction. The best failure prediction results were achieved with gradient boosting over decision trees, combined with the sliding window method and the exclusion of the segments of the time series that precede anomalies in the log data. This makes it possible to predict failures in the considered data a reasonable time before the system fails. For cases where labeled training data is lacking, an unsupervised approach using isolation forests and an automatic data labeling approach are proposed.
|