Information flow identification in large email datasets

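Identifying information flow in emails is an important, yet challenging task. In this work we investigate several algorithms for identifying similar sentences in large email datasets, as well as an algorithm for reconstructing threads from unstructured emails. We present a detailed evaluation of each algorithm in terms of accuracy and time performance. We also investigate the usage of cloud computing in order to increase computational efficiency and make information discovery usable in real time.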


Bibliographic Details
Main Author: Akuney, Arseniy
Language: English
Published: University of British Columbia 2011
Online Access: http://hdl.handle.net/2429/39847
id ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.-39847
record_format oai_dc
spelling ndltd-LACETR-oai-collectionscanada.gc.ca-BVAU.-39847 | 2013-06-05T04:20:13Z | Information flow identification in large email datasets | Akuney, Arseniy | Identifying information flow in emails is an important, yet challenging task. In this work we investigate several algorithms for identifying similar sentences in large email datasets, as well as an algorithm for reconstructing threads from unstructured emails. We present a detailed evaluation of each algorithm in terms of accuracy and time performance. We also investigate the usage of cloud computing in order to increase computational efficiency and make information discovery usable in real time. | University of British Columbia | 2011-12-23T18:13:45Z | 2011-12-23T18:13:45Z | 2011 | 2011-12-23 | 2012-05 | Electronic Thesis or Dissertation | http://hdl.handle.net/2429/39847 | eng
collection NDLTD
language English
sources NDLTD
description Identifying information flow in emails is an important, yet challenging task. In this work we investigate several algorithms for identifying similar sentences in large email datasets, as well as an algorithm for reconstructing threads from unstructured emails. We present a detailed evaluation of each algorithm in terms of accuracy and time performance. We also investigate the usage of cloud computing in order to increase computational efficiency and make information discovery usable in real time.
author Akuney, Arseniy
title Information flow identification in large email datasets
publisher University of British Columbia
publishDate 2011
url http://hdl.handle.net/2429/39847