A customizable grammar-based framework for user-intent text classification

In real-life classification problems, prior information about the problem and expert knowledge about the domain are often used to obtain reliable and consistent solutions. This is especially true in fields where the data is ambiguous, such as text, in which the same words can be used in seemingly si...


Bibliographic Details
Main Author: Mohasseb, Alaa
Other Authors: Bader-El-Den, Mohamed ; Cocea, Mihaela ; Hadikin, Glenn Stewart
Published: University of Portsmouth 2018
Subjects:
004
Online Access: https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.765694
id ndltd-bl.uk-oai-ethos.bl.uk-765694
record_format oai_dc
spelling ndltd-bl.uk-oai-ethos.bl.uk-765694
2019-03-05T15:37:16Z
A customizable grammar-based framework for user-intent text classification
Mohasseb, Alaa
Bader-El-Den, Mohamed ; Cocea, Mihaela ; Hadikin, Glenn Stewart
2018
004
University of Portsmouth
https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.765694
https://researchportal.port.ac.uk/portal/en/theses/a-customizable-grammarbased-framework-for-userintent-text-classification(5f5e43d8-dbda-4748-9941-3a6a5d12abf8).html
Electronic Thesis or Dissertation
collection NDLTD
sources NDLTD
topic 004
description In real-life classification problems, prior information about the problem and expert knowledge about the domain are often used to obtain reliable and consistent solutions. This is especially true in fields where the data is ambiguous, such as text, in which the same words can be used in seemingly similar texts but have different meanings. Many of the proposed approaches rely on the bag-of-words representation, which loses information about the structure of the text. This thesis provides a literature review of related work in text classification, including an overview of text classification methods and a detailed review of related work in two text classification domains: search engines and question answering systems. The core contribution is divided into three main parts. The first contribution is the Customizable Grammar Framework for user-intent text classification (CGF), which employs a formal grammar approach and exploits domain-related information in a new way to represent text as a series of syntactic categories forming syntactic patterns. The proposed framework has been applied to different domains, which resulted in the second and third contributions. The second contribution is the Grammar-Based Framework for Query Classification (GQC), which improved query identification and classification. The third contribution is the Grammar-Based Framework for Question Categorization and Classification (GQCC), which improved question identification and classification. Across different machine learning algorithms, the overall results show that the proposed approach outperforms previous ones in classification performance for both query and question classification. Finally, a comparison with state-of-the-art approaches has been conducted; the results validate that the proposed approach improves classification accuracy and the identification of the different types of queries and questions.
author2 Bader-El-Den, Mohamed ; Cocea, Mihaela ; Hadikin, Glenn Stewart
author_facet Bader-El-Den, Mohamed ; Cocea, Mihaela ; Hadikin, Glenn Stewart
Mohasseb, Alaa
author Mohasseb, Alaa
author_sort Mohasseb, Alaa
title A customizable grammar-based framework for user-intent text classification
title_sort customizable grammar-based framework for user-intent text classification
publisher University of Portsmouth
publishDate 2018
url https://ethos.bl.uk/OrderDetails.do?uin=uk.bl.ethos.765694