Knowledge Acquisition, Delivery and Prediction through Text Mining

Bibliographic Details
Main Author: Schumaker, Robert P.
Other Authors: Chen, Hsinchun
Language: EN
Published: The University of Arizona. 2007
Online Access: http://hdl.handle.net/10150/194680
Description
Summary: The World Wide Web is an abundant source for Textual Web Mining research. Data can be acquired from Web texts and converted to Information or Knowledge for immediate consumption. Studying the acquisition and consumption of Web text can provide a glimpse into the social/behavioral aspects of Web Users and Web Content Providers. Patterns embedded within textual data can be similarly identified through technical means and even anticipated.

Seven essays explore the important algorithmic and computational aspects needed in the analysis of acquiring, delivering and making predictions from Web texts. Chapters 2 and 3 describe the knowledge acquisition process and the feasibility of leveraging Web users. While the knowledge acquired from Web users was not as refined as that from domain experts, the knowledge gathered was found to be of acceptable quality. From our analysis of dialog systems, it was found that Web users were more likely to augment the breadth of existing knowledge by adding new response sets to the knowledge base. Chapters 4 and 5 look at the aspects of knowledge delivery to Web users. Using a dialog system, we observe the acceptance and satisfaction levels of dialog responses in general conversation, domain knowledge and the combination of both knowledge bases. Chapters 6 through 8 consider the prediction facet of knowledge using textual financial news articles and stock prices. This section compares different model parameters and textual representations to best describe future prices, and examines document representation based on the sector and industry in which a company is engaged. From these analyses we found that Sector-based aggregation led to the best price predictions.

Together these essays effectively leverage large amounts of textual Web data to represent knowledge in meaningful ways to end users. These essays also provide the blueprints for several real-world applications. The approaches and techniques described borrow from the referent disciplines of linguistics, finance, computer science and statistics, as well as MIS, and demonstrate potentially useful applications for dialog systems, quantitative stock prediction and other knowledge management processes in which textual data can be accurately represented and forecast, thus improving the exchange of human knowledge.