A Study of Text Mining Framework for Automated Classification of Software Requirements in Enterprise Systems

abstract: Text Classification is a rapidly evolving area of Data Mining, while Requirements Engineering is a less-explored area of Software Engineering which deals with the process of defining, documenting and maintaining a software system's requirements. When researchers decided to blend these two s...

Bibliographic Details
Other Authors: Swadia, Japa Nimish (Author)
Format: Dissertation
Language: English
Published: 2016
Subjects:
R
Online Access: http://hdl.handle.net/2286/R.I.38809
id ndltd-asu.edu-item-38809
record_format oai_dc
spelling ndltd-asu.edu-item-38809 2018-06-22T03:07:34Z A Study of Text Mining Framework for Automated Classification of Software Requirements in Enterprise Systems abstract: Text Classification is a rapidly evolving area of Data Mining, while Requirements Engineering is a less-explored area of Software Engineering which deals with the process of defining, documenting and maintaining a software system's requirements. When researchers blended these two streams, the result was research on automating the classification of software requirements statements into categories that developers can easily comprehend, for faster development and delivery. Until now this classification was mostly done manually by software engineers, which is indeed a tedious job. However, most of the research focused on classification of non-functional requirements, which pertain to intangible features such as security, reliability, quality and so on. It is a challenging task to automatically classify functional requirements, those pertaining to how the system will function, especially those belonging to different and large enterprise systems. This requires exploiting text mining capabilities. This thesis aims to investigate the results of text classification applied to functional software requirements by creating a framework in R and making use of algorithms and techniques like k-nearest neighbors, support vector machines, and many others like boosting, bagging, maximum entropy, neural networks and random forests in an ensemble approach. The study was conducted by collecting and visualizing relevant enterprise data that had previously been classified manually and was subsequently used for training the model. Key components for training included the frequency of terms in the documents and the level of cleanliness of the data. The model was applied to test data and validated for analysis by studying and comparing parameters like precision, recall and accuracy.
Dissertation/Thesis Swadia, Japa Nimish (Author) Ghazarian, Arbi (Advisor) Bansal, Srividya (Committee member) Gaffar, Ashraf (Committee member) Arizona State University (Publisher) Computer science Engineering data analytics R requirements classification text classification text mining eng 61 pages Masters Thesis Engineering 2016 Masters Thesis http://hdl.handle.net/2286/R.I.38809 http://rightsstatements.org/vocab/InC/1.0/ All Rights Reserved 2016
collection NDLTD
language English
format Dissertation
sources NDLTD
topic Computer science
Engineering
data analytics
R
requirements classification
text classification
text mining
author2 Swadia, Japa Nimish (Author)
title A Study of Text Mining Framework for Automated Classification of Software Requirements in Enterprise Systems
title_sort study of text mining framework for automated classification of software requirements in enterprise systems
publishDate 2016
url http://hdl.handle.net/2286/R.I.38809
_version_ 1718701175878975488