Incident threading in news
With an overwhelming volume of news reports currently available, there is an increasing need for automatic techniques to analyze and present news to a general reader in a meaningful and efficient manner. Previous research has focused primarily on organizing news stories into a list of clusters by the main topics that they discuss.
Main Author: | Feng, Ao |
---|---|
Language: | ENG |
Published: | ScholarWorks@UMass Amherst, 2008 |
Subjects: | Information science, Computer science |
Online Access: | https://scholarworks.umass.edu/dissertations/AAI3337013 |
id | ndltd-UMASS-oai-scholarworks.umass.edu-dissertations-5278 |
---|---|
record_format | oai_dc |
spelling | ndltd-UMASS-oai-scholarworks.umass.edu-dissertations-5278 2020-12-02T14:35:02Z Incident threading in news Feng, Ao 2008-01-01T08:00:00Z text https://scholarworks.umass.edu/dissertations/AAI3337013 Doctoral Dissertations Available from Proquest ENG ScholarWorks@UMass Amherst Information science|Computer science |
collection | NDLTD |
language | ENG |
sources | NDLTD |
topic | Information science|Computer science |
description | With an overwhelming volume of news reports currently available, there is an increasing need for automatic techniques to analyze and present news to a general reader in a meaningful and efficient manner. Previous research has focused primarily on organizing news stories into a list of clusters by the main topics that they discuss. We believe that viewing a news topic as a simple collection of stories is restrictive and inefficient for a user hoping to understand the information quickly. As a proposed solution to the automatic news organization problem, we introduce incident threading in this thesis. All text that describes the occurrence of a real-world happening is merged into a news incident, and incidents are organized in a network with dependencies of predefined types. In order to simplify the implementation, we start with the common assumption that a news story is coherent in content. In the story threading system, a cluster of news documents discussing the same topic is further grouped into smaller sets, each of which represents a separate news event. Binary links are established to reflect the contextual information among those events. Experiments in story threading show promising results. We next describe an enhanced version called relation-oriented story threading that extends the range of the prior work by assigning type labels to the links and describing the relation within each story pair as a competitive process among multiple options. The quality of links is greatly improved with a global optimization process. Our final approach, passage threading, removes the story-coherence assumption by conducting passage-level processing of news. First, we develop a new testbed for this research and extend the evaluation methods to address new issues. Next, a calibration study demonstrates that an incident network helps reading comprehension with an accuracy of 25-30% in a matrix comparison evaluation. Then a new three-stage algorithm is described that identifies on-subject passages, groups them into incidents, and establishes links between related incidents. Finally, significant improvement over earlier work is observed when the training phase optimizes the harmonic mean of various evaluation measures, and the performance meets the goal in the calibration study. |
author | Feng, Ao |
title | Incident threading in news |
publisher | ScholarWorks@UMass Amherst |
publishDate | 2008 |
url | https://scholarworks.umass.edu/dissertations/AAI3337013 |