Updateable PAT-Tree Approach to Chinese Key Phrase Extraction using Mutual Information: A Linguistic Foundation for Knowledge Management



Bibliographic Details
Main Authors: Ong, Thian-Huat, Chen, Hsinchun
Language: en
Published: 1999
Subjects:
Online Access: http://hdl.handle.net/10150/105216
Description
Summary: Artificial Intelligence Lab, Department of MIS, University of Arizona === There has been renewed research interest in statistical approaches to extracting key phrases from Chinese documents, because existing approaches do not allow online frequency updates after phrases have been extracted. This results in inaccurate, partial extraction. In this paper, we present an updateable PAT-tree approach. In our experiment, we compared our approach with that of Lee-Feng Chien; ours improved recall from 0.19 to 0.43 and precision from 0.52 to 0.70. This paper also reviews the requirements for a data structure that facilitates implementation of any statistical approach to key-phrase extraction, including PAT-tree, PAT-array, and suffix array with semi-infinite strings.
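
As a rough illustration of the kind of data structure the summary mentions, the following minimal sketch builds a suffix array over semi-infinite strings (every suffix of the text) and uses it to count how often a candidate phrase occurs. It is not the paper's updateable PAT-tree and does not support online frequency updates; the function names `build_suffix_array` and `count_occurrences` and the sample text are assumptions made for this example only.

```python
def build_suffix_array(text):
    """Return start offsets of all suffixes (semi-infinite strings), sorted lexicographically.

    Naive construction for illustration; not efficient for large corpora.
    """
    return sorted(range(len(text)), key=lambda i: text[i:])


def count_occurrences(text, suffix_array, phrase):
    """Count occurrences of `phrase` via two binary searches over the suffix array."""
    k = len(phrase)

    # Lower bound: first suffix whose k-character prefix is >= phrase.
    lo, hi = 0, len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + k] < phrase:
            lo = mid + 1
        else:
            hi = mid
    first = lo

    # Upper bound: first suffix whose k-character prefix is > phrase.
    hi = len(suffix_array)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        if text[start:start + k] <= phrase:
            lo = mid + 1
        else:
            hi = mid
    return lo - first


# Hypothetical sample text (not taken from the paper's corpus).
sample_text = "知识管理与知识发现"
sample_sa = build_suffix_array(sample_text)
frequency = count_occurrences(sample_text, sample_sa, "知识")
```

Because all suffixes sharing a prefix sit next to each other in the sorted array, the frequency of any character n-gram can be read off as the width of a contiguous range, which is the property statistical key-phrase extraction relies on.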