Incorporating Content Structure into Text Analysis Applications
URL to papers listed on conference site
Main Authors: | , , |
---|---|
Other Authors: | |
Format: | Article |
Language: | English |
Published: | Association for Computational Linguistics, 2011-04-19T18:21:46Z. |
Subjects: | |
Online Access: | Get fulltext |
Summary: | Information about the content structure of a document is largely ignored by current text analysis applications such as information extraction and sentiment analysis. This stands in contrast to the linguistic intuition that rich contextual information should benefit such applications. We present a framework that combines a supervised text analysis application with the induction of latent content structure. Both elements are learned jointly using the EM algorithm. The induced content structure is learned from a large unannotated corpus and biased by the underlying text analysis task. We demonstrate that exploiting content structure yields significant improvements over approaches that rely only on local context. |