Unsupervised syntactic category learning from child-directed speech

Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 57-59). The goal of this research was to discover what kinds of syntactic categories can be l...

Bibliographic Details
Main Author:	Wichrowska, Olga N
Other Authors:	Robert C. Berwick.
Format:	Others
Language:	English
Published:	Massachusetts Institute of Technology 2011
Subjects:	Electrical Engineering and Computer Science.
Online Access:	http://hdl.handle.net/1721.1/62756

id	ndltd-MIT-oai-dspace.mit.edu-1721.1-62756
record_format	oai_dc
spelling	ndltd-MIT-oai-dspace.mit.edu-1721.1-62756 2019-05-02T16:26:24Z Unsupervised syntactic category learning from child-directed speech Wichrowska, Olga N Robert C. Berwick. Massachusetts Institute of Technology. Dept. of Electrical Engineering and Computer Science. Electrical Engineering and Computer Science. Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 57-59). The goal of this research was to discover what kinds of syntactic categories can be learned using distributional analysis on linear context of words, specifically in child-directed speech. The idea behind this is that the categories used by children could very well be different from adult categories. There is some evidence that distributional analysis could be used for some aspects of language acquisition, though very strong arguments exist for why it is not enough to acquire grammar. These experiments can help identify what kind of data can be learned from linear context and statistics only. This paper reports the results of three established automatic syntactic category learning algorithms on a small, edited input set of child-directed speech from the CHILDES database. Hierarchical clustering, K-Means analysis, and an implementation of a substitution algorithm are all used to assign syntactic categories to words based on their linear distributional context. Overall, open classes (nouns, verbs, adjectives) were reliably categorized, and some methods were able to distinguish prepositions, adverbs, subjects vs. objects, and verbs by subcategorization frame. The main barrier standing between these methods and human-like categorization is the inability to deal with the ambiguity that is omnipresent in natural language and poses an important problem for future models of syntactic category acquisition. by Olga N. Wichrowska. M.Eng. 2011-05-09T15:30:47Z 2011-05-09T15:30:47Z 2010 2010 Thesis http://hdl.handle.net/1721.1/62756 717716094 eng M.I.T. theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. See provided URL for inquiries about permission. http://dspace.mit.edu/handle/1721.1/7582 59 p. application/pdf Massachusetts Institute of Technology
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Electrical Engineering and Computer Science.
spellingShingle	Electrical Engineering and Computer Science. Wichrowska, Olga N Unsupervised syntactic category learning from child-directed speech
description	Thesis (M. Eng.)--Massachusetts Institute of Technology, Dept. of Electrical Engineering and Computer Science, 2010. Cataloged from PDF version of thesis. Includes bibliographical references (p. 57-59). The goal of this research was to discover what kinds of syntactic categories can be learned using distributional analysis on linear context of words, specifically in child-directed speech. The idea behind this is that the categories used by children could very well be different from adult categories. There is some evidence that distributional analysis could be used for some aspects of language acquisition, though very strong arguments exist for why it is not enough to acquire grammar. These experiments can help identify what kind of data can be learned from linear context and statistics only. This paper reports the results of three established automatic syntactic category learning algorithms on a small, edited input set of child-directed speech from the CHILDES database. Hierarchical clustering, K-Means analysis, and an implementation of a substitution algorithm are all used to assign syntactic categories to words based on their linear distributional context. Overall, open classes (nouns, verbs, adjectives) were reliably categorized, and some methods were able to distinguish prepositions, adverbs, subjects vs. objects, and verbs by subcategorization frame. The main barrier standing between these methods and human-like categorization is the inability to deal with the ambiguity that is omnipresent in natural language and poses an important problem for future models of syntactic category acquisition. by Olga N. Wichrowska. M.Eng.
author2	Robert C. Berwick.
author_facet	Robert C. Berwick. Wichrowska, Olga N
author	Wichrowska, Olga N
author_sort	Wichrowska, Olga N
title	Unsupervised syntactic category learning from child-directed speech
publisher	Massachusetts Institute of Technology
publishDate	2011
url	http://hdl.handle.net/1721.1/62756

Similar Items