Feature Factory : a collaborative, crowd-sourced machine learning system

Bibliographic Details
Main Author: Wang, Alex Christopher
Other Authors: Kalyan Veeramachaneni.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2016
Online Access: http://hdl.handle.net/1721.1/100859
Description
Summary: Thesis: M. Eng. in Computer Science and Engineering, Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2015. === This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. === Cataloged from student-submitted PDF version of thesis. === Includes bibliographical references (page 71). === In this thesis, I designed, implemented, and tested Feature Factory, a machine learning system for crowd-sourcing feature discovery. Feature Factory provides a complete web-based platform for users to define, extract, and test features on any given machine learning problem. This project involved designing, implementing, and testing a proof-of-concept version of this platform. Creating the platform involved developing user-side infrastructure and system-side infrastructure. The user-side infrastructure required careful design decisions to provide users with a clear and concise interface and workflow. The system-side infrastructure involved constructing an automated feature aggregation, extraction, and testing pipeline that can be executed with a few simple commands. Testing was performed by presenting three different machine learning problems to test users via the user-side infrastructure of Feature Factory. Users were asked to write features for the three problems and to comment on the usability of the system. The system-side infrastructure was then used to analyze the effectiveness and performance of the features written by the users. === by Alex Christopher Wang. === M. Eng. in Computer Science and Engineering