Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach.

The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples, when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are...

Bibliographic Details
Main Authors:	Nico Nagelkerke, Vaclav Fidler
Format:	Article
Language:	English
Published:	Public Library of Science (PLoS) 2015-01-01
Series:	PLoS ONE
Online Access:	http://europepmc.org/articles/PMC4608588?pdf=render

id	doaj-6b624cb89d554459a9546f136657833e
record_format	Article
spelling	doaj-6b624cb89d554459a9546f136657833e | 2020-11-24T21:54:49Z | eng | Public Library of Science (PLoS) | PLoS ONE | 1932-6203 | 2015-01-01 | 10 | 10 | e0140718 | 10.1371/journal.pone.0140718 | Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach. | Nico Nagelkerke | Vaclav Fidler | http://europepmc.org/articles/PMC4608588?pdf=render
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Nico Nagelkerke Vaclav Fidler
title	Estimating a Logistic Discrimination Functions When One of the Training Samples Is Subject to Misclassification: A Maximum Likelihood Approach.
publisher	Public Library of Science (PLoS)
series	PLoS ONE
issn	1932-6203
publishDate	2015-01-01
description	The problem of discrimination and classification is central to much of epidemiology. Here we consider the estimation of a logistic regression/discrimination function from training samples when one of the training samples is subject to misclassification or mislabeling, e.g. diseased individuals are incorrectly classified/labeled as healthy controls. We show that this leads to a zero-inflated binomial model with a defective logistic regression or discrimination function, whose parameters can be estimated using standard statistical methods such as maximum likelihood. These parameters can be used to estimate the probability of true group membership among those classified, possibly erroneously, as controls. Two examples are analyzed and discussed. A simulation study explores properties of the maximum likelihood parameter estimates and of the estimates of the number of mislabeled observations.
url	http://europepmc.org/articles/PMC4608588?pdf=render
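
The model summarized in the abstract can be illustrated with a minimal sketch. It rests on assumptions that are not taken from the article itself: a single covariate x, a hypothetical parameter theta for the probability that a true case is labeled as a case (independent of x), and no mislabeling of true controls. Under those assumptions the probability of being labeled a case is theta times an ordinary logistic function, which is "defective" in the sense that it levels off at theta rather than 1. The function names and the parameterization are illustrative only and may differ from those used by the authors.

```python
import math


def sigmoid(z):
    """Ordinary logistic function."""
    return 1.0 / (1.0 + math.exp(-z))


def p_labeled_case(x, alpha, beta, theta):
    """P(labeled case | x) under the assumed model.

    theta (hypothetical name) is the probability that a true case is
    labeled as a case; the curve's upper asymptote is theta, not 1.
    """
    return theta * sigmoid(alpha + beta * x)


def log_likelihood(xs, labels, alpha, beta, theta):
    """Binomial log-likelihood of the observed labels (1 = case, 0 = control)."""
    total = 0.0
    for x, y in zip(xs, labels):
        p = p_labeled_case(x, alpha, beta, theta)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def p_true_case_given_labeled_control(x, alpha, beta, theta):
    """P(true case | labeled control, x), by Bayes' rule under the assumed model."""
    s = sigmoid(alpha + beta * x)
    return (1.0 - theta) * s / (1.0 - theta * s)
```

Maximizing `log_likelihood` over alpha, beta and theta with any general-purpose optimizer would give maximum likelihood estimates under these assumptions. Summing `p_true_case_given_labeled_control` over the labeled controls would then give one possible estimate of the number of mislabeled observations.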

Similar Items