Medical abstract inference dataset

Bibliographic Details
Main Author: De León, Eduardo Enrique
Other Authors: Regina Barzilay.
Format: Others
Language: English
Published: Massachusetts Institute of Technology 2018
Subjects:
Online Access: http://hdl.handle.net/1721.1/119516
Description
Summary: Thesis: M. Eng., Massachusetts Institute of Technology, Department of Electrical Engineering and Computer Science, 2017. === This electronic version was submitted by the student author. The certified thesis is available in the Institute Archives and Special Collections. === Cataloged from student-submitted PDF version of thesis. === Includes bibliographical references (page 35). === In this thesis, I built a dataset for predicting clinical outcomes from medical abstracts and their titles. The dataset, Medical Abstract Inference, consists of 1,794 data points. Titles were filtered to include the abstract's reported medical intervention and clinical outcome. Each data point was annotated with the intervention's effect on the outcome, using one of three labels: increased, decreased, or no significant difference. In addition, rationale sentences were marked; these sentences supply the supporting evidence needed for the overall prediction. Preliminary modeling was also done to evaluate the corpus, using top-performing Natural Language Inference models as well as rationale-based models and linear classifiers. === by Eduardo Enrique de León. === M. Eng.