Confidence Score: The Forgotten Dimension of Object Detection Performance Evaluation

When deploying a model for object detection, a confidence score threshold is chosen to filter out false positives and ensure that a predicted bounding box has a certain minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold as a...

Bibliographic Details
Main Authors: Simon Wenkel, Khaled Alhazmi, Tanel Liiv, Saud Alrshoud, Martin Simon
Format: Article
Language: English
Published: MDPI AG 2021-06-01
Series: Sensors
Subjects: computer vision, deep neural networks, object detection, confidence score
Online Access: https://www.mdpi.com/1424-8220/21/13/4350
id doaj-48fc7dcb1c964241b36dcd4fdba31fa7
record_format Article
spelling doaj-48fc7dcb1c964241b36dcd4fdba31fa7 (record timestamp 2021-07-15T15:45:09Z, language eng)
citation Sensors (MDPI AG), ISSN 1424-8220, 2021-06-01, volume 21, article 4350
doi 10.3390/s21134350
affiliation Simon Wenkel: Marduk Technologies OÜ, 12618 Tallinn, Estonia
Khaled Alhazmi: National Center for Robotics and Internet of Things Technology, Communication and Information Technologies Research Institute, King Abdulaziz City for Science and Technology (KACST), Riyadh 11442, Saudi Arabia
Tanel Liiv: Marduk Technologies OÜ, 12618 Tallinn, Estonia
Saud Alrshoud: National Center for Robotics and Internet of Things Technology, Communication and Information Technologies Research Institute, King Abdulaziz City for Science and Technology (KACST), Riyadh 11442, Saudi Arabia
Martin Simon: Marduk Technologies OÜ, 12618 Tallinn, Estonia
collection DOAJ
publisher MDPI AG
series Sensors
issn 1424-8220
publishDate 2021-06-01
description When deploying a model for object detection, a confidence score threshold is chosen to filter out false positives and ensure that a predicted bounding box has a certain minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold, as a high number of false positives is not penalized by standard evaluation metrics. However, in Artificial Intelligence (AI) applications that require high confidence scores (e.g., because of legal requirements or because the consequences of incorrect detections are severe) or a certain level of model robustness, it is unclear which base model to use, since the models were mainly optimized for benchmark scores. In this paper, we propose a method to find the optimum performance point of a model as a basis for fairer comparison and deeper insights into the trade-offs caused by selecting a confidence score threshold.
topic computer vision
deep neural networks
object detection
confidence score