Improving software model inference by combining state merging and Markov models

Labelled-transition systems (LTS) are widely used by developers and testers to model software systems in terms of their sequential behaviour. They provide an overview of the behaviour of the system and their reaction to different inputs. LTS models are the foundation for various automated verificati...

Full description

Bibliographic Details
Main Author: Alsaeedi, Abdullah
Other Authors: Bogdanov, Kirill
Published: University of Sheffield 2016
Subjects:
004
Online Access:http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.690159
id ndltd-bl.uk-oai-ethos.bl.uk-690159
record_format oai_dc
collection NDLTD
sources NDLTD
topic 004
spellingShingle 004
Alsaeedi, Abdullah
Improving software model inference by combining state merging and Markov models
description Labelled-transition systems (LTS) are widely used by developers and testers to model software systems in terms of their sequential behaviour. They provide an overview of the behaviour of the system and their reaction to different inputs. LTS models are the foundation for various automated verification techniques such as model-checking and model-based testing. These techniques require up-to-date models to be meaningful. Unfortunately, software models are rare in practice. Due to the effort and time required to build these models manually, a software engineer would want to infer them automatically from traces (sequences of events or function calls). Many techniques have focused on inferring LTS models from given traces of system execution, where these traces are produced by running a system on a series of tests. State-merging is the foundation of some of the most successful LTS inference techniques to construct LTS models. Passive inference approaches such as k-tail and Evidence-Driven State Merging (EDSM) can infer LTS models from these traces. Moreover, the best-performing methods of inferring LTS models rely on the availability of negatives, i.e. traces that are not permitted from specific states and such information is not usually available. The long-standing challenge for such inference approaches is constructing models well from very few traces and without negatives. Active inference techniques such as Query-driven State Merging (QSM) can learn LTSs from traces by asking queries as tests to a system being learnt. It may lead to infer inaccurate LTSs since the performance of QSM relies on the availability of traces. The challenge for such inference approaches is inferring LTSs well from very few traces and with fewer queries asked. In this thesis, investigations of the existing techniques are presented to the challenge of inferring LTS models from few positive traces. These techniques fail to find correct LTS models in cases of insufficient training data. This thesis focuses on finding better solutions to this problem by using evidence obtained from the Markov models to bias the EDSM learner towards merging states that are more likely to correspond to the same state in a model. Markov models are used to capture the dependencies between event sequences in the collected traces. Those dependencies rely on whether elements of event permitted or prohibited to follow short sequences appear in the traces. This thesis proposed EDSM-Markov a passive inference technique that aimed to improve the existing ones in the absence of negative traces and to prevent the over-generalization problem. In this thesis, improvements obtained by the proposed learners are demonstrated by a series of experiments using randomly-generated labelled-transition systems and case studies. The results obtained from the conducted experiments showed that EDSM-Markov can infer better LTSs compared to other techniques. This thesis also proposes modifications to the QSM learner to improve the accuracy of the inferred LTSs. This results in a new learner, which is named ModifiedQSM. This includes considering more tests to the system being inferred in order to avoid the over-generalization problem. It includes investigations of using Markov models to reduce the number of queries consumed by the ModifiedQSM learner. Hence, this thesis introduces a new LTS inference technique, which is called MarkovQSM. Moreover, enhancements of LTSs inferred by ModifiedQSM and MarkovQSM learners are demonstrated by a series of experiments. The results from the experiments demonstrate that ModifiedQSM can infer better LTSs compared to other techniques. Moreover, MarkovQSM has proven to significantly reduce the number of membership queries consumed compared to ModifiedQSM with a very small loss of accuracy.
author2 Bogdanov, Kirill
author_facet Bogdanov, Kirill
Alsaeedi, Abdullah
author Alsaeedi, Abdullah
author_sort Alsaeedi, Abdullah
title Improving software model inference by combining state merging and Markov models
title_short Improving software model inference by combining state merging and Markov models
title_full Improving software model inference by combining state merging and Markov models
title_fullStr Improving software model inference by combining state merging and Markov models
title_full_unstemmed Improving software model inference by combining state merging and Markov models
title_sort improving software model inference by combining state merging and markov models
publisher University of Sheffield
publishDate 2016
url http://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.690159
work_keys_str_mv AT alsaeediabdullah improvingsoftwaremodelinferencebycombiningstatemergingandmarkovmodels
_version_ 1718570857550315520
spelling ndltd-bl.uk-oai-ethos.bl.uk-6901592017-12-24T15:45:13ZImproving software model inference by combining state merging and Markov modelsAlsaeedi, AbdullahBogdanov, Kirill2016Labelled-transition systems (LTS) are widely used by developers and testers to model software systems in terms of their sequential behaviour. They provide an overview of the behaviour of the system and their reaction to different inputs. LTS models are the foundation for various automated verification techniques such as model-checking and model-based testing. These techniques require up-to-date models to be meaningful. Unfortunately, software models are rare in practice. Due to the effort and time required to build these models manually, a software engineer would want to infer them automatically from traces (sequences of events or function calls). Many techniques have focused on inferring LTS models from given traces of system execution, where these traces are produced by running a system on a series of tests. State-merging is the foundation of some of the most successful LTS inference techniques to construct LTS models. Passive inference approaches such as k-tail and Evidence-Driven State Merging (EDSM) can infer LTS models from these traces. Moreover, the best-performing methods of inferring LTS models rely on the availability of negatives, i.e. traces that are not permitted from specific states and such information is not usually available. The long-standing challenge for such inference approaches is constructing models well from very few traces and without negatives. Active inference techniques such as Query-driven State Merging (QSM) can learn LTSs from traces by asking queries as tests to a system being learnt. It may lead to infer inaccurate LTSs since the performance of QSM relies on the availability of traces. The challenge for such inference approaches is inferring LTSs well from very few traces and with fewer queries asked. In this thesis, investigations of the existing techniques are presented to the challenge of inferring LTS models from few positive traces. These techniques fail to find correct LTS models in cases of insufficient training data. This thesis focuses on finding better solutions to this problem by using evidence obtained from the Markov models to bias the EDSM learner towards merging states that are more likely to correspond to the same state in a model. Markov models are used to capture the dependencies between event sequences in the collected traces. Those dependencies rely on whether elements of event permitted or prohibited to follow short sequences appear in the traces. This thesis proposed EDSM-Markov a passive inference technique that aimed to improve the existing ones in the absence of negative traces and to prevent the over-generalization problem. In this thesis, improvements obtained by the proposed learners are demonstrated by a series of experiments using randomly-generated labelled-transition systems and case studies. The results obtained from the conducted experiments showed that EDSM-Markov can infer better LTSs compared to other techniques. This thesis also proposes modifications to the QSM learner to improve the accuracy of the inferred LTSs. This results in a new learner, which is named ModifiedQSM. This includes considering more tests to the system being inferred in order to avoid the over-generalization problem. It includes investigations of using Markov models to reduce the number of queries consumed by the ModifiedQSM learner. Hence, this thesis introduces a new LTS inference technique, which is called MarkovQSM. Moreover, enhancements of LTSs inferred by ModifiedQSM and MarkovQSM learners are demonstrated by a series of experiments. The results from the experiments demonstrate that ModifiedQSM can infer better LTSs compared to other techniques. Moreover, MarkovQSM has proven to significantly reduce the number of membership queries consumed compared to ModifiedQSM with a very small loss of accuracy.004University of Sheffieldhttp://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.690159http://etheses.whiterose.ac.uk/13645/Electronic Thesis or Dissertation