Co-Training and Ensemble Learning for Duplicate Detection in Adverse Drug Event Reporting Systems


Bibliographic Details
Main Authors: Chiao-Feng Lo, 羅喬楓
Other Authors: Wen-Yang Lin
Format: Others
Language: en_US
Published: 2013
Online Access: http://ndltd.ncl.edu.tw/handle/74533117604020370925
Description
Summary: Master's thesis === National University of Kaohsiung === Master's Program, Department of Computer Science and Information Engineering === 101 === Detecting adverse drug reactions (ADRs) is an important topic in public health and in the development of the modern pharmaceutical industry. Because the number of subjects in clinical trials is too small to identify potential adverse drug reactions before a drug is approved for marketing, many countries have established spontaneous reporting systems (SRSs) to support postmarketing surveillance of marketed drugs and to collect enough data to detect unknown adverse drug reactions. Unfortunately, because SRS data come from many different reporters, duplicate reporting arises, and even a small number of duplicate records can bias detection results. Although much work has been done on duplicate record detection, very little of it addresses adverse drug reaction datasets, and none of it considers the existence of follow-up reports. As a result, existing methods for detecting duplicate ADR reports cannot distinguish true duplicates from follow-up links. In this study, we investigate the problem of identifying duplicate ADR reports in SRSs in the presence of follow-ups. We propose an ensemble and co-training based detection method that, for a given report, can detect not only its duplicates but also its initial or earlier linked cases.
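
This record does not describe the thesis's algorithm in detail, so the following is only a generic sketch of how co-training with a small ensemble might be applied to report pairs, not the authors' method. Everything in it is an assumption made for illustration: the three labels, the two feature views (drug/reaction similarity and patient similarity/report-date gap), the toy nearest-centroid base learner, and all of the numbers.

```python
from statistics import mean


class CentroidClassifier:
    """Toy nearest-centroid learner; a stand-in for any real base classifier."""

    def fit(self, X, y):
        self.centroids = {}
        for label in set(y):
            rows = [x for x, l in zip(X, y) if l == label]
            self.centroids[label] = [mean(col) for col in zip(*rows)]
        return self

    def predict_with_margin(self, x):
        # Margin = distance gap between the nearest and second-nearest centroid.
        ranked = sorted(
            (sum((a - b) ** 2 for a, b in zip(x, c)) ** 0.5, label)
            for label, c in self.centroids.items()
        )
        margin = ranked[1][0] - ranked[0][0] if len(ranked) > 1 else 0.0
        return ranked[0][1], margin


def fit_views(view_a, view_b, labels):
    """Train one classifier per view on the currently labeled pairs."""
    known = [i for i, l in enumerate(labels) if l is not None]
    y = [labels[i] for i in known]
    clf_a = CentroidClassifier().fit([view_a[i] for i in known], y)
    clf_b = CentroidClassifier().fit([view_b[i] for i in known], y)
    return clf_a, clf_b


def co_train(view_a, view_b, labels, rounds=5, per_round=1):
    """Each round, each view's classifier labels its most confident unlabeled pairs."""
    labels = list(labels)
    for _ in range(rounds):
        clf_a, clf_b = fit_views(view_a, view_b, labels)
        for clf, view in ((clf_a, view_a), (clf_b, view_b)):
            pool = [i for i, l in enumerate(labels) if l is None]
            pool.sort(key=lambda i: clf.predict_with_margin(view[i])[1], reverse=True)
            for i in pool[:per_round]:
                labels[i] = clf.predict_with_margin(view[i])[0]
        if all(l is not None for l in labels):
            break
    return fit_views(view_a, view_b, labels) + (labels,)


def ensemble_predict(clf_a, clf_b, xa, xb):
    """Two-member ensemble: trust whichever view is more confident."""
    votes = (clf_a.predict_with_margin(xa), clf_b.predict_with_margin(xb))
    return max(votes, key=lambda v: v[1])[0]


# Hypothetical report pairs. View A: (drug similarity, reaction similarity).
view_a = [(0.95, 0.9), (0.9, 0.5), (0.1, 0.2), (0.92, 0.95), (0.85, 0.45), (0.2, 0.1)]
# View B: (patient similarity, normalized gap between report dates).
view_b = [(0.95, 0.05), (0.9, 0.8), (0.1, 0.5), (0.9, 0.1), (0.95, 0.7), (0.15, 0.4)]
# Only the first three pairs are labeled; None marks unlabeled pairs.
labels = ["duplicate", "follow_up", "distinct", None, None, None]

clf_a, clf_b, filled = co_train(view_a, view_b, labels)
print(filled)
print(ensemble_predict(clf_a, clf_b, (0.9, 0.92), (0.92, 0.08)))
```

The separate "follow_up" label reflects the abstract's point that a follow-up report should be linked to its earlier case rather than treated as a true duplicate. A realistic system would replace the toy learner and the hand-made features with proper record-comparison features and stronger base classifiers.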