Mutual Information as a Performance Measure for Binary Predictors Characterized by Both ROC Curve and PROC Curve Analysis


Bibliographic Details
Main Authors: Gareth Hughes, Jennifer Kopetzky, Neil McRoberts
Format: Article
Language: English
Published: MDPI AG 2020-08-01
Series: Entropy
Online Access: https://www.mdpi.com/1099-4300/22/9/938
Description
Summary: The predictive receiver operating characteristic (PROC) curve differs from the more well-known receiver operating characteristic (ROC) curve in that it provides a basis for the evaluation of binary diagnostic tests using metrics defined conditionally on the outcome of the test rather than metrics defined conditionally on the actual disease status. Application of PROC curve analysis may be hindered by the complex graphical patterns that are sometimes generated. Here we present an information theoretic analysis that allows concurrent evaluation of PROC curves and ROC curves together in a simple graphical format. The analysis is based on the observation that mutual information may be viewed both as a function of ROC curve summary statistics (sensitivity and specificity) and prevalence, and as a function of predictive values and prevalence. Mutual information calculated from a 2 × 2 prediction-realization table for a specified risk score threshold on an ROC curve is the same as the mutual information calculated at the same risk score threshold on a corresponding PROC curve. Thus, for a given value of prevalence, the risk score threshold that maximizes mutual information is the same on both the ROC curve and the corresponding PROC curve. Phytopathologists and clinicians who have previously relied solely on ROC curve summary statistics when formulating risk thresholds for application in practical agricultural or clinical decision-making contexts are thus presented with a methodology that brings predictive values within the scope of that formulation.
ISSN: 1099-4300
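
The equivalence stated in the summary can be checked numerically. The following is a minimal sketch, not the authors' code: it builds the 2 × 2 prediction-realization table once from sensitivity, specificity, and prevalence, and once from the predictive values and prevalence, then computes mutual information from each. The function names, the example inputs, and the use of natural logarithms (nats) are assumptions made for illustration. The predictive-value route also assumes an informative test, so that PPV + NPV differs from 1.

```python
import math


def table_from_roc(sens, spec, prev):
    """Joint probabilities (TP, FN, FP, TN) from ROC summary statistics."""
    tp = sens * prev
    fn = (1 - sens) * prev
    fp = (1 - spec) * (1 - prev)
    tn = spec * (1 - prev)
    return tp, fn, fp, tn


def mutual_information(tp, fn, fp, tn):
    """Mutual information (nats, an assumed unit) of a 2 x 2 joint table."""
    p_dis = tp + fn   # marginal P(disease present)
    p_pos = tp + fp   # marginal P(test positive)
    cells = [
        (tp, p_dis, p_pos),
        (fn, p_dis, 1 - p_pos),
        (fp, 1 - p_dis, p_pos),
        (tn, 1 - p_dis, 1 - p_pos),
    ]
    # Empty cells contribute zero, following the convention 0 * log(0) = 0.
    return sum(p * math.log(p / (a * b)) for p, a, b in cells if p > 0)


def predictive_values(sens, spec, prev):
    """Positive and negative predictive values via Bayes' rule."""
    tp, fn, fp, tn = table_from_roc(sens, spec, prev)
    return tp / (tp + fp), tn / (tn + fn)


def table_from_predictive_values(ppv, npv, prev):
    """Joint probabilities (TP, FN, FP, TN) from predictive values.

    P(test positive) = q solves prev = ppv * q + (1 - npv) * (1 - q);
    this requires ppv + npv != 1.
    """
    q = (prev + npv - 1) / (ppv + npv - 1)
    return ppv * q, (1 - npv) * (1 - q), (1 - ppv) * q, npv * (1 - q)


# Hypothetical operating point at one risk score threshold.
sens, spec, prev = 0.8, 0.9, 0.2
ppv, npv = predictive_values(sens, spec, prev)

mi_roc = mutual_information(*table_from_roc(sens, spec, prev))
mi_proc = mutual_information(*table_from_predictive_values(ppv, npv, prev))
print(math.isclose(mi_roc, mi_proc))
```

Because both routes reconstruct the same joint table, the two values agree up to floating-point error. Repeating the calculation over a grid of thresholds would therefore identify the same mutual-information-maximizing threshold on either curve.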