Mining mailing lists for content

Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2003. === Includes bibliographical references (leaves 65-67). === In large decentralized institutions such as MIT, finding information about events and activities on a campus-wide basis can be a str...

Bibliographic Details
Main Author:	Harik, Mario A. (Mario Adel), 1980-
Other Authors:	John Williams.
Format:	Others
Language:	English
Published:	Massachusetts Institute of Technology 2006
Subjects:	Civil and Environmental Engineering.
Online Access:	http://hdl.handle.net/1721.1/29557

id	ndltd-MIT-oai-dspace.mit.edu-1721.1-29557
record_format	oai_dc
spelling	ndltd-MIT-oai-dspace.mit.edu-1721.1-29557 2019-05-02T15:51:53Z Mining mailing lists for content Harik, Mario A. (Mario Adel), 1980- John Williams. Massachusetts Institute of Technology. Dept. of Civil and Environmental Engineering. Massachusetts Institute of Technology. Dept. of Civil and Environmental Engineering. Civil and Environmental Engineering. Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2003. Includes bibliographical references (leaves 65-67). In large decentralized institutions such as MIT, finding information about events and activities on a campus-wide basis can be a strenuous task. This is mainly due to the ephemeral nature of events and the inability to impose a centralized information system to all event organizers and target audiences. For the purpose of advertising events, Email is the communication medium of choice. In particular, there is a wide-spread use of electronic mailing lists to publicize events and activities. These can be used as a valuable source for information mining. This dissertation will propose two mining architectures to find category-specific event announcements broadcasted on public MIT mailing lists. At the center of these mining systems is a text classifier that groups Emails based on their textual content. Classification is followed by information extraction where labeled data, such as the event date, is identified and stored along with the Email content in a searchable database. The first architecture is based on a probabilistic classification method, namely naive-Bayes while the second uses a rules-based classifier. A case implementation, FreeFood@MIT, was implemented to expose the results of these classification schemes and is used as a benchmark for recommendations. by Mario A. Harik. M.Eng. 2006-03-24T16:01:59Z 2006-03-24T16:01:59Z 2003 2003 Thesis http://hdl.handle.net/1721.1/29557 52724268 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 81 leaves 4083264 bytes 4083072 bytes application/pdf application/pdf application/pdf Massachusetts Institute of Technology
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Civil and Environmental Engineering.
spellingShingle	Civil and Environmental Engineering. Harik, Mario A. (Mario Adel), 1980- Mining mailing lists for content
description	Thesis (M.Eng.)--Massachusetts Institute of Technology, Dept. of Civil and Environmental Engineering, 2003. === Includes bibliographical references (leaves 65-67). === In large decentralized institutions such as MIT, finding information about events and activities on a campus-wide basis can be a strenuous task. This is mainly due to the ephemeral nature of events and the inability to impose a centralized information system to all event organizers and target audiences. For the purpose of advertising events, Email is the communication medium of choice. In particular, there is a wide-spread use of electronic mailing lists to publicize events and activities. These can be used as a valuable source for information mining. This dissertation will propose two mining architectures to find category-specific event announcements broadcasted on public MIT mailing lists. At the center of these mining systems is a text classifier that groups Emails based on their textual content. Classification is followed by information extraction where labeled data, such as the event date, is identified and stored along with the Email content in a searchable database. The first architecture is based on a probabilistic classification method, namely naive-Bayes while the second uses a rules-based classifier. A case implementation, FreeFood@MIT, was implemented to expose the results of these classification schemes and is used as a benchmark for recommendations. === by Mario A. Harik. === M.Eng.
author2	John Williams.
author_facet	John Williams. Harik, Mario A. (Mario Adel), 1980-
author	Harik, Mario A. (Mario Adel), 1980-
author_sort	Harik, Mario A. (Mario Adel), 1980-
title	Mining mailing lists for content
title_short	Mining mailing lists for content
title_full	Mining mailing lists for content
title_fullStr	Mining mailing lists for content
title_full_unstemmed	Mining mailing lists for content
title_sort	mining mailing lists for content
publisher	Massachusetts Institute of Technology
publishDate	2006
url	http://hdl.handle.net/1721.1/29557
work_keys_str_mv	AT harikmarioamarioadel1980 miningmailinglistsforcontent
_version_	1719029994456350720

Similar Items