A Process Model Collection and Gold Standard Correspondences for Process Model Matching

Bibliographic Details
Main Authors: Khurram Shahzad, Rao Muhammad Adeel Nawab, Adnan Abid, Kareem Sharif, Faizan Ali, Faisal Aslam, Arslaan Mazhar
Format: Article
Language: English
Published: IEEE 2019-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8667007/
Description
Summary: Business process models are conceptual models that depict the workflow of an organization. Process model matching (PMM) refers to the automatic identification of corresponding activities between a pair of process models that show similar or the same behavior. During the last few years, PMM has received considerable attention from researchers due to its wide range of applications, such as clone detection and harmonization of process models. Consequently, a plethora of PMM techniques has been developed. To evaluate the effectiveness of these techniques, experts have developed three benchmark datasets, formally called the PMMC'15 datasets. Furthermore, the process models in these datasets have been converted into OAEI'17 ontologies. These resources are a valuable asset for the PMM community to evaluate process model matching techniques. However, these resources (PMMC'15 and OAEI'17) are limited to a small number of models and only a handful of corresponding activities among these models, which may not be sufficient to rigorously evaluate PMM techniques. To fill this gap, this paper provides a large, diverse, and carefully handcrafted collection of process models, along with their benchmark correspondences. The process model collection and the benchmark correspondences between these models are freely available to the community [1]. Our newly developed dataset, together with the existing resources, can be used for a thorough evaluation of PMM techniques, especially in the context of the vocabulary mismatch problem. Finally, we have evaluated the characteristics of our dataset through a series of experiments involving similarity measures widely used in PMM research. The results reveal that our dataset is larger, more diverse, and more challenging than the existing datasets in the PMM domain.
ISSN: 2169-3536
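
Illustrative sketch: the summary describes evaluating PMM techniques against benchmark (gold standard) correspondences using similarity measures. The Python sketch below shows one way such an evaluation could look. It is not the authors' method: token-level Jaccard similarity, the 0.5 threshold, the function names, the activity labels, and the gold pairs are all assumptions chosen for illustration, and precision, recall, and F1 are assumed here as the evaluation metrics.

```python
from itertools import product


def tokens(label):
    """Lowercase a label and split it into a set of word tokens."""
    return set(label.lower().split())


def jaccard(a, b):
    """Jaccard similarity between the token sets of two activity labels."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def match(model_a, model_b, threshold=0.5):
    """Return every activity pair whose label similarity reaches the threshold."""
    return {(a, b) for a, b in product(model_a, model_b) if jaccard(a, b) >= threshold}


def evaluate(predicted, gold):
    """Precision, recall, and F1 of predicted pairs against gold pairs."""
    tp = len(predicted & gold)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(gold) if gold else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical activity labels, not taken from the dataset.
model_a = ["Check application", "Send acceptance letter", "Reject applicant"]
model_b = ["Check application documents", "Send letter of acceptance", "Decline application"]

# Hypothetical gold standard correspondences for the two models above.
gold = {
    ("Check application", "Check application documents"),
    ("Send acceptance letter", "Send letter of acceptance"),
    ("Reject applicant", "Decline application"),
}

predicted = match(model_a, model_b)
precision, recall, f1 = evaluate(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

In this toy example the pair "Reject applicant" / "Decline application" shares no word tokens, so a purely lexical measure misses it. That is a small instance of the vocabulary mismatch problem mentioned in the summary.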