Effect of ontology hierarchy on a concept vector machine's ability to classify web documents

As the quantity of text documents created on the web grows, the ability of experts to manually classify them has decreased. Because people need to find and organize this information, interest has grown in developing automatic means of categorizing these documents. In this effort, ontologies have been...

Bibliographic Details
Main Author: Graham, Jeffrey A.
Format: Others
Published: NSUWorks 2009
Subjects:
Online Access: http://nsuworks.nova.edu/gscis_etd/165
http://nsuworks.nova.edu/cgi/viewcontent.cgi?article=1164&context=gscis_etd
id ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1164
record_format oai_dc
spelling ndltd-nova.edu-oai-nsuworks.nova.edu-gscis_etd-1164 2016-10-20T03:58:58Z 2009-01-01T08:00:00Z text application/pdf CEC Theses and Dissertations NSUWorks
collection NDLTD
sources NDLTD
topic Concept vector
Hierarchy
Ontology
Reuters-21578
Support Vector Machine
Text classification
Computer Sciences
description As the quantity of text documents created on the web grows, the ability of experts to manually classify them has decreased. Because people need to find and organize this information, interest has grown in developing automatic means of categorizing these documents. In this effort, ontologies have been developed that capture domain-specific knowledge in the form of a hierarchy of concepts. Support Vector Machines are machine learning methods that are widely used for automated document categorization. Recent studies suggest that the classification accuracy of a Support Vector Machine may be improved by using concepts defined by a domain ontology instead of the words that appear in the document. However, such studies have not taken into account the hierarchy inherent in the relationships between concepts. The goal of this dissertation was to investigate whether the hierarchical relationships among concepts in ontologies can be exploited to improve the classification accuracy of web documents by a Support Vector Machine. Concept vectors that capture the hierarchy of domain ontologies were created and used to train a Support Vector Machine. Tests conducted using the benchmark Reuters-21578 data set indicate that Support Vector Machines achieve higher classification accuracy when they make use of the hierarchical relationships among concepts in ontologies.
author Graham, Jeffrey A.
title Effect of ontology hierarchy on a concept vector machine's ability to classify web documents