Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach.
The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are...
Main Authors: | Nico Nagelkerke, Vaclav Fidler |
---|---|
Format: | Article |
Language: | English |
Published: | Public Library of Science (PLoS), 2015-01-01 |
Series: | PLoS ONE |
Online Access: | http://europepmc.org/articles/PMC4608588?pdf=render |
id |
doaj-6b624cb89d554459a9546f136657833e |
---|---|
spelling |
doaj-6b624cb89d554459a9546f136657833e; 2020-11-24T21:54:49Z; eng; PLoS ONE, ISSN 1932-6203; 2015-01-01; vol. 10, no. 10, e0140718; doi:10.1371/journal.pone.0140718 |
collection |
DOAJ |
author |
Nico Nagelkerke Vaclav Fidler |
title |
Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach. |
issn |
1932-6203 |
description |
The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those, possibly erroneously, classified as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and the estimates of the number of mislabeled observations. |
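The abstract describes a "defective" logistic function and the probability of true group membership among those labeled as controls. The sketch below illustrates one way such a model might be written down; it is not taken from the article. It assumes a single covariate `x`, that true case status follows an ordinary logistic model, and that a true case keeps its case label with a constant probability `gamma` (all other true cases end up labeled as controls). The names `defective_prob`, `neg_log_lik`, `prob_case_given_control_label`, and `simulate` are hypothetical, and the article's own parameterization may differ.

```python
import math
import random


def expit(z):
    """Numerically stable logistic function 1 / (1 + exp(-z))."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def defective_prob(x, intercept, slope, gamma):
    """P(labeled case | x): a logistic curve whose upper asymptote is gamma, not 1."""
    return gamma * expit(intercept + slope * x)


def neg_log_lik(params, xs, labels):
    """Negative Bernoulli log-likelihood of the observed (possibly wrong) labels."""
    intercept, slope, gamma = params
    eps = 1e-12  # keeps log() away from 0
    total = 0.0
    for x, y in zip(xs, labels):
        q = min(max(defective_prob(x, intercept, slope, gamma), eps), 1.0 - eps)
        total -= math.log(q) if y == 1 else math.log(1.0 - q)
    return total


def prob_case_given_control_label(x, intercept, slope, gamma):
    """Bayes' rule: P(true case | labeled control, x) under the assumed model."""
    p = expit(intercept + slope * x)
    return (1.0 - gamma) * p / (1.0 - gamma * p)


def simulate(n, intercept, slope, gamma, seed=0):
    """Draw covariates and labels under the assumed mislabeling mechanism."""
    rng = random.Random(seed)
    xs, labels = [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        is_case = rng.random() < expit(intercept + slope * x)
        # A true case keeps its case label only with probability gamma.
        labeled_case = is_case and rng.random() < gamma
        xs.append(x)
        labels.append(1 if labeled_case else 0)
    return xs, labels
```

Under these assumptions, `neg_log_lik` could be handed to any general-purpose numerical optimizer to obtain maximum likelihood estimates of the intercept, slope, and `gamma`. Setting `gamma = 1` recovers ordinary logistic regression with no mislabeling.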