Development of FAQ-master as a New Intelligent Web Information System

Bibliographic Details
Main Authors: Sheng-Yuan Yang, 楊勝源
Other Authors: Cheng-Seen Ho
Format: Others
Language: en_US
Published: 2005
Online Access: http://ndltd.ncl.edu.tw/handle/21027107311457073775
id ndltd-TW-094NTUST428165
record_format oai_dc
collection NDLTD
language en_US
format Others
sources NDLTD
description Doctoral === National Taiwan University of Science and Technology === Department of Electronic Engineering === 94 === This thesis describes our research in developing FAQ-master, an intelligent Web information system. The system performs intelligent discovery, retrieval, filtering, proxying, ranking, and presentation of Web information in order to provide high-quality FAQ solutions that meet users’ information requests. By a high-quality answer we mean one that is profound, up to date, and relevant to the user’s question. We group the problems into four: how to faithfully capture user intention, how to effectively discover and aggregate Web information, how to present the relevant results to the user, and how to provide an efficient proxy mechanism that shortens turnaround time. We propose the following techniques to tackle these issues: ontology, user models, website models, and data aggregation and proxy mechanisms.
Based on these techniques, FAQ-master comprises four agents, namely the Interface Agent, Proxy Agent, Answerer Agent, and Search Agent, which improve search results effectively and efficiently in three aspects of Web search activity: user intention, document processing, and website search.
The Interface Agent works as an assistant between the user and the FAQ system to capture the user’s true intention. Based on user modeling and on template-based and ontology-supported techniques, the agent supports natural language queries, enhanced by pattern-matching and template-based techniques; assistance and guidance for human-machine interaction; and better personalized information services. It also handles user feedback on the suitability of the proposed responses. The Proxy Agent works as a two-tier mediator between the Interface Agent and the backend Answerer Agent. It employs an ontology-enhanced intelligent proxy mechanism to alleviate the overloading problem usually associated with a backend server.
The Answerer Agent cleans, retrieves, and transforms FAQ information collected from a heterogeneous environment, such as the Web, and stores it in an ontological database. It works as a backend process that performs ontology-directed information aggregation, supported by the wrapper technique, on the webpages collected by the Search Agent. Finally, the Search Agent performs Web information retrieval that is both user-oriented and domain-related, with the help of ontology-supported website models. This approach gives the Search Agent a semantic-level solution, so that it can provide domain-specific, focused Web information discovery with a high degree of user satisfaction.
Our first contribution concerns the techniques of user modeling and query processing in the Interface Agent, which features ontology-supported, template-based user modeling and query processing. Our preliminary experiments show that the system correctly understands the user intention and focus of up to eighty percent of user queries. The experiments also verify the robustness of the linguistic pattern-matching technique by demonstrating its effectiveness in analyzing the intention and focus of users’ queries.
The second contribution concerns the techniques of query prediction in the Proxy Agent. The agent has two notable features. First, it performs fast user-oriented mining and prediction by discovering frequent queries and predicted queries from the user query history. The improved sequential pattern mining algorithm is made more efficient by perfect hashing and database decomposition. Second, it performs ontology-directed case-based reasoning. The semantics of the PC ontology, in particular the VRelationships, are used in determining similar cases, adapting cases, and retaining cases.
Our experiments show that the agent can take over up to 70% of the query load from the backend process, which considerably helps overall query performance.
The third contribution concerns the techniques of organizing and processing unstructured Web information in the Answerer Agent. The agent employs ontology as the key technique, supported by wrapper techniques, to clean, retrieve, and transform unstructured FAQ information collected from a heterogeneous environment, and stores it in an ontological database that reflects the ontological structure. When retrieving FAQs, the agent trims irrelevant query keywords, uses either full or partial keywords to retrieve FAQs, and removes conflicting FAQs before returning the final results to the user, all with the support of ontology. In addition, to present the search results more effectively, the agent employs an enhanced ranking technique that combines four measures, Appearance Probability, Satisfaction Value, Compatibility Value, and Statistic Similarity Value, with proper weights to rank the FAQs. Our experiments show that the agent improves the precision rate and produces better ranking results.
The final contribution concerns the techniques of reflecting both user-oriented and domain-focused aspects of Web search in the Search Agent. The agent features an ontology-supported website modeling technique that provides a semantic-level solution for a search engine, so that it can deliver fast, precise, and stable search results with a high degree of user satisfaction. The website modeling technique is closely connected to the domain ontology, which supports the following functions in both website model construction and application: query expansion, webpage annotation, webpage/website classification, and focused collection of domain-related and user-interested Web resources. The agent has the following notable characteristics.
1) Ontology-supported construction of website models. This attributes domain semantics to the Web resources collected and stored in the local database. One important contribution here is the new ontology-supported OntoClassifier, which classifies webpages accurately and stably to support more correct annotation of domain semantics. Our experiments show that OntoClassifier performs very well in obtaining accurate and stable webpage classification.
2) Website-model-supported Web resource discovery. This takes into account both user interests and domain specificity. The contribution here is the new Focused Crawler, which employs progressive strategies, namely user-query-driven webpage expansion, autonomous website expansion, and query results exploitation, to effectively expand the website models.
3) Website-model-supported webpage retrieval. This leverages ontology features as a fast index structure to locate the webpages the user most wants.
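
The description above says the Answerer Agent ranks FAQs by combining four measures (Appearance Probability, Satisfaction Value, Compatibility Value, and Statistic Similarity Value) with weights. The record does not give the formula, the weight values, or how each measure is computed, so the following Python sketch is only an illustration under stated assumptions: a plain linear weighted sum, equal placeholder weights, measure values already normalized to [0, 1], and hypothetical names (`FaqScores`, `combined_score`, `rank_faqs`) that do not come from the thesis.

```python
from dataclasses import dataclass

# Hypothetical weights: the record says only that "proper weights" are used.
WEIGHTS = {
    "appearance_probability": 0.25,
    "satisfaction_value": 0.25,
    "compatibility_value": 0.25,
    "statistic_similarity": 0.25,
}


@dataclass(frozen=True)
class FaqScores:
    """The four measure values for one candidate FAQ (assumed to lie in [0, 1])."""
    faq_id: str
    appearance_probability: float
    satisfaction_value: float
    compatibility_value: float
    statistic_similarity: float


def combined_score(faq, weights=WEIGHTS):
    """Linear weighted sum of the four measures (an assumed combination rule)."""
    return sum(weight * getattr(faq, name) for name, weight in weights.items())


def rank_faqs(faqs, weights=WEIGHTS):
    """Return the candidates sorted from highest to lowest combined score."""
    return sorted(faqs, key=lambda faq: combined_score(faq, weights), reverse=True)


if __name__ == "__main__":
    # Made-up measure values, purely for demonstration.
    candidates = [
        FaqScores("faq-1", 0.2, 0.9, 0.5, 0.4),
        FaqScores("faq-2", 0.8, 0.6, 0.7, 0.9),
        FaqScores("faq-3", 0.3, 0.3, 0.3, 0.3),
    ]
    for faq in rank_faqs(candidates):
        print(faq.faq_id, round(combined_score(faq), 3))
```

Keeping the weights in a separate mapping makes it easy to try other weightings; a different combination rule would only require replacing `combined_score`.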
spelling ndltd-TW-094NTUST428165 2015-12-23T04:08:14Z http://ndltd.ncl.edu.tw/handle/21027107311457073775 Development of FAQ-master as a New Intelligent Web Information System 新一代智慧型網路資訊系統FAQ-master Sheng-Yuan Yang 楊勝源 Doctoral National Taiwan University of Science and Technology Department of Electronic Engineering 94 Cheng-Seen Ho 何正信 2005 thesis 121 en_US