Summary: | Failure prediction is a key component of modern autonomic systems. A crucial decision when performing it is which observation window to use, that is, how far into the past to look in order to predict accurately. Currently, this decision is a highly manual process that depends on expert knowledge. To alleviate this problem, we propose using a customized genetic algorithm alongside a machine learning technique, random forests, to optimize a novel multiple-observation-window scheme that allows for more modeling complexity than other schemes in the literature. We validate it using ten different events extracted from two real industrial data sets: one from a high-performance computing environment and one from a computer network. We show that our algorithm automatically creates models that optimize performance while reducing the observed time, with minimal user input required.
|