Information flow identification in large email datasets
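Identifying information flow in emails is an important, yet challenging task. In this work we investigate several algorithms for identifying similar sentences in large email datasets, as well as an algorithm for reconstructing threads from unstructured emails. We present a detailed evaluation of each algorithm in terms of accuracy and time performance. We also investigate the usage of cloud computing in order to increase computational efficiency and make information discovery usable in real time.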
Main Author: | Akuney, Arseniy |
---|---|
Language: | English |
Published: | University of British Columbia, 2011 |
Online Access: | http://hdl.handle.net/2429/39847 |
id | ndltd-UBC-oai-circle.library.ubc.ca-2429-39847 |
---|---|
record_format | oai_dc |
spelling | ndltd-UBC-oai-circle.library.ubc.ca-2429-39847 2018-01-05T17:25:33Z Information flow identification in large email datasets Akuney, Arseniy Identifying information flow in emails is an important, yet challenging task. In this work we investigate several algorithms for identifying similar sentences in large email datasets, as well as an algorithm for reconstructing threads from unstructured emails. We present a detailed evaluation of each algorithm in terms of accuracy and time performance. We also investigate the usage of cloud computing in order to increase computational efficiency and make information discovery usable in real time. Science, Faculty of Computer Science, Department of Graduate 2011-12-23T18:13:45Z 2011-12-23T18:13:45Z 2011 2012-05 Text Thesis/Dissertation http://hdl.handle.net/2429/39847 eng Attribution-NonCommercial-NoDerivatives 4.0 International http://creativecommons.org/licenses/by-nc-nd/4.0/ University of British Columbia |
collection | NDLTD |
language | English |
sources | NDLTD |
description | Identifying information flow in emails is an important, yet challenging task. In this work we investigate several algorithms for identifying similar sentences in large email datasets, as well as an algorithm for reconstructing threads from unstructured emails. We present a detailed evaluation of each algorithm in terms of accuracy and time performance. We also investigate the usage of cloud computing in order to increase computational efficiency and make information discovery usable in real time. === Science, Faculty of === Computer Science, Department of === Graduate |
author | Akuney, Arseniy |
spellingShingle | Akuney, Arseniy Information flow identification in large email datasets |
author_facet | Akuney, Arseniy |
author_sort | Akuney, Arseniy |
title | Information flow identification in large email datasets |
title_short | Information flow identification in large email datasets |
title_full | Information flow identification in large email datasets |
title_fullStr | Information flow identification in large email datasets |
title_full_unstemmed | Information flow identification in large email datasets |
title_sort | information flow identification in large email datasets |
publisher | University of British Columbia |
publishDate | 2011 |
url | http://hdl.handle.net/2429/39847 |
work_keys_str_mv | AT akuneyarseniy informationflowidentificationinlargeemaildatasets |
_version_ | 1718583168291831808 |