Generalized expectation criteria for lightly supervised learning


Bibliographic Details
Main Author: Druck, Gregory
Language: English
Published: ScholarWorks@UMass Amherst 2011
Online Access:https://scholarworks.umass.edu/dissertations/AAI3482615
Description
Summary: Machine learning has facilitated many recent advances in natural language processing and information extraction. Unfortunately, most machine learning methods rely on costly labeled data, which impedes their application to new problems. Even in the absence of labeled data we often have a wealth of prior knowledge about these problems. For example, we may know which labels particular words are likely to indicate for a sequence labeling task, or we may have linguistic knowledge suggesting probable dependencies for syntactic analysis. This thesis focuses on incorporating such prior knowledge into learning, with the goal of reducing annotation effort for information extraction and natural language processing tasks. We advocate constraints on expectations as a flexible and interpretable language for encoding prior knowledge. We focus on the development of Generalized Expectation (GE), a method for learning with expectation constraints and unlabeled data. We explore the various flexibilities afforded by GE criteria, derive efficient algorithms for GE training, and relate GE to other methods for incorporating prior knowledge into learning. We then use GE to develop lightly supervised approaches to text classification, dependency parsing, sequence labeling, and entity resolution that yield accurate models for these tasks with minimal human effort. We also consider the incorporation of GE into interactive training systems that actively solicit prior knowledge from the user and assist the user in evaluating and analyzing model predictions.
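For concreteness, a minimal sketch of a GE training objective, in the general form used in the GE literature (the exact formulation, distance function, and notation in the thesis may differ; the symbols below are illustrative), combines a log-likelihood term on any labeled data L, a penalty on the mismatch between target expectations and model expectations on unlabeled data U, and a regularizer:

O(\theta) = \sum_{(x,y) \in L} \log p_\theta(y \mid x) \;-\; \lambda\, \Delta\!\bigl(\tilde{e},\; \mathbb{E}_{x \sim U}\bigl[\mathbb{E}_{p_\theta(y \mid x)}[f(x,y)]\bigr]\bigr) \;-\; \frac{\lVert\theta\rVert^2}{2\sigma^2}

Here f is a constraint feature (for example, an indicator that a particular word occurs with a particular label), \tilde{e} is the user-specified target expectation for f, \Delta is a distance such as KL divergence or squared error, and \lambda and \sigma are hyperparameters. In the lightly supervised setting the labeled-data term may be absent, so the expectation constraints and unlabeled data alone drive parameter estimation.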