Automated verbal autopsy classification: using one-against-all ensemble method and Naïve Bayes classifier [version 2; referees: 2 approved]

Verbal autopsy (VA) deals with post-mortem surveys about deaths, mostly in low- and middle-income countries, where the majority of deaths occur at home rather than in a hospital, for retrospective assignment of causes of death (COD) and subsequently evidence-based health system strengthening. Automated...


Bibliographic Details
Main Authors: Syed Shariyar Murtaza, Patrycja Kolpak, Ayse Bener, Prabhat Jha
Format: Article
Language: English
Published: F1000 Research Ltd 2019-01-01
Series: Gates Open Research
Online Access: https://gatesopenresearch.org/articles/2-63/v2
id doaj-65e3938961e24793b69e1b84b5f58ac7
record_format Article
spelling doaj-65e3938961e24793b69e1b84b5f58ac7
2020-11-25T03:44:22Z
eng
F1000 Research Ltd
Gates Open Research
2572-4754
2019-01-01
2
10.12688/gatesopenres.12891.2
14008
Automated verbal autopsy classification: using one-against-all ensemble method and Naïve Bayes classifier [version 2; referees: 2 approved]
Syed Shariyar Murtaza (Data Science Lab, Ryerson University, Toronto, Ontario, M5B 2K3, Canada)
Patrycja Kolpak (Centre for Global Health Research, St. Michael's Hospital, Toronto, Ontario, Canada)
Ayse Bener (Data Science Lab, Ryerson University, Toronto, Ontario, M5B 2K3, Canada)
Prabhat Jha (Centre for Global Health Research, St. Michael's Hospital, Toronto, Ontario, Canada)
https://gatesopenresearch.org/articles/2-63/v2
collection DOAJ
language English
format Article
sources DOAJ
author Syed Shariyar Murtaza
Patrycja Kolpak
Ayse Bener
Prabhat Jha
publisher F1000 Research Ltd
series Gates Open Research
issn 2572-4754
publishDate 2019-01-01
description Verbal autopsy (VA) deals with post-mortem surveys about deaths, mostly in low- and middle-income countries, where the majority of deaths occur at home rather than in a hospital, for retrospective assignment of causes of death (COD) and subsequently evidence-based health system strengthening. Automated algorithms for VA COD assignment have been developed and their performance has been assessed against physician and clinical diagnoses. Since the performance of automated classification methods remains low, we aimed to enhance the Naïve Bayes Classifier (NBC) algorithm to produce better ranked COD classifications on 26,766 deaths from four globally diverse VA datasets compared to some of the leading VA classification methods, namely Tariff, InterVA-4, InSilicoVA and NBC. We used a different strategy, training multiple NBC algorithms using the one-against-all approach (OAA-NBC). To compare performance, we computed the cumulative cause-specific mortality fraction (CSMF) accuracies for population-level agreement from rank one to five COD classifications. To assess individual-level COD assignments, cumulative partial chance-corrected concordance (PCCC) and sensitivity were measured for up to five ranked classifications. Overall results show that, based on the cumulative CSMF accuracy, PCCC and sensitivity scores, OAA-NBC consistently assigns CODs that are the most similar to physician and clinical COD assignments compared to some of the leading algorithms. The results demonstrate that our approach improves the performance of classification (sensitivity) by between 6% and 8% compared with other VA algorithms. Population-level agreements for OAA-NBC and NBC were found to be similar to or higher than those of the other algorithms used in the experiments. Although OAA-NBC still requires improvement for individual-level COD assignment, the one-against-all approach improved its ability to assign CODs that more closely resemble physician or clinical COD classifications compared to some of the other leading VA classifiers.
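The description above names two ideas that a short sketch may make concrete: training one binary Naïve Bayes model per cause of death (one-against-all) and ranking causes by score, and scoring population-level agreement with CSMF accuracy. The Python below is only an illustration under stated assumptions and is not the authors' implementation: the Bernoulli model on binary symptom indicators, the Laplace smoothing, the toy data, and all function names (`train_binary_nb`, `train_oaa`, `rank_causes`, `csmf_accuracy`) are hypothetical. The CSMF accuracy formula is the commonly used definition, 1 - sum_j |CSMF_true_j - CSMF_pred_j| / (2 * (1 - min_j CSMF_true_j)).

```python
import math
from collections import Counter


def train_binary_nb(X, y, alpha=1.0):
    """Fit a Bernoulli Naive Bayes model for binary labels (1 = target cause)."""
    n_features = len(X[0])
    model = {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        # Laplace-smoothed class prior and per-symptom presence probabilities.
        prior = (len(rows) + alpha) / (len(X) + 2 * alpha)
        probs = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
        model[label] = (math.log(prior), probs)
    return model


def positive_posterior(model, x):
    """Return P(target cause | x) under one binary model."""
    log_joint = {}
    for label, (log_prior, probs) in model.items():
        total = log_prior
        for xj, p in zip(x, probs):
            total += math.log(p if xj else 1.0 - p)
        log_joint[label] = total
    # Normalise in a numerically stable way.
    peak = max(log_joint.values())
    norm = sum(math.exp(v - peak) for v in log_joint.values())
    return math.exp(log_joint[1] - peak) / norm


def train_oaa(X, causes):
    """Train one cause-versus-rest model for every distinct cause."""
    return {
        c: train_binary_nb(X, [1 if t == c else 0 for t in causes])
        for c in sorted(set(causes))
    }


def rank_causes(models, x, top_k=5):
    """Rank causes by the positive-class posterior of each binary model."""
    scores = {c: positive_posterior(m, x) for c, m in models.items()}
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


def csmf_accuracy(true_causes, predicted_causes):
    """CSMF accuracy between true and predicted cause distributions."""
    n = len(true_causes)
    true_counts = Counter(true_causes)
    pred_counts = Counter(predicted_causes)
    all_causes = set(true_counts) | set(pred_counts)
    if len(all_causes) < 2:
        raise ValueError("CSMF accuracy needs at least two causes")
    abs_error = sum(
        abs(true_counts[c] / n - pred_counts[c] / n) for c in all_causes
    )
    min_true = min(true_counts[c] / n for c in all_causes)
    return 1.0 - abs_error / (2.0 * (1.0 - min_true))


if __name__ == "__main__":
    # Toy data (made up): columns are fever, cough, injury indicators.
    X = [[1, 1, 0], [1, 1, 0], [1, 0, 0], [0, 0, 1], [0, 0, 1], [0, 1, 0]]
    causes = ["pneumonia", "pneumonia", "malaria", "injury", "injury", "pneumonia"]
    models = train_oaa(X, causes)
    top_ranked = [rank_causes(models, x)[0] for x in X]
    print(rank_causes(models, [0, 0, 1]))
    print(csmf_accuracy(causes, top_ranked))
```

Ranking by the positive-class posterior of each binary model is one simple way to obtain the "rank one to five" classifications mentioned in the description; other scoring or calibration choices are possible.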
url https://gatesopenresearch.org/articles/2-63/v2