Confidence Score: The Forgotten Dimension of Object Detection Performance Evaluation
When deploying a model for object detection, a confidence score threshold is chosen to filter out false positives and ensure that a predicted bounding box has a certain minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold as a...
Main Authors: | Simon Wenkel, Khaled Alhazmi, Tanel Liiv, Saud Alrshoud, Martin Simon |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2021-06-01 |
Series: | Sensors |
Subjects: | computer vision, deep neural networks, object detection, confidence score |
Online Access: | https://www.mdpi.com/1424-8220/21/13/4350 |
id |
doaj-48fc7dcb1c964241b36dcd4fdba31fa7 |
---|---|
record_format |
Article |
spelling |
doaj-48fc7dcb1c964241b36dcd4fdba31fa7; 2021-07-15T15:45:09Z; eng; MDPI AG; Sensors; 1424-8220; 2021-06-01; 21; 4350; 4350; 10.3390/s21134350; Confidence Score: The Forgotten Dimension of Object Detection Performance Evaluation; Simon Wenkel (0), Khaled Alhazmi (1), Tanel Liiv (2), Saud Alrshoud (3), Martin Simon (4); (0) Marduk Technologies OÜ, 12618 Tallinn, Estonia; (1) National Center for Robotics and Internet of Things Technology, Communication and Information Technologies Research Institute, King Abdulaziz City for Science and Technology—KACST, Riyadh 11442, Saudi Arabia; (2) Marduk Technologies OÜ, 12618 Tallinn, Estonia; (3) National Center for Robotics and Internet of Things Technology, Communication and Information Technologies Research Institute, King Abdulaziz City for Science and Technology—KACST, Riyadh 11442, Saudi Arabia; (4) Marduk Technologies OÜ, 12618 Tallinn, Estonia; When deploying a model for object detection, a confidence score threshold is chosen to filter out false positives and ensure that a predicted bounding box has a certain minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold, as a high number of false positives is not penalized by standard evaluation metrics. However, in Artificial Intelligence (AI) applications that require high confidence scores (e.g., due to legal requirements or because the consequences of incorrect detections are severe) or a certain level of model robustness, it is unclear which base model to use, since the models were mainly optimized for benchmark scores. In this paper, we propose a method to find the optimum performance point of a model as a basis for fairer comparison and deeper insights into the trade-offs caused by selecting a confidence score threshold.; https://www.mdpi.com/1424-8220/21/13/4350; computer vision; deep neural networks; object detection; confidence score |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Simon Wenkel Khaled Alhazmi Tanel Liiv Saud Alrshoud Martin Simon |
author_sort |
Simon Wenkel |
title |
Confidence Score: The Forgotten Dimension of Object Detection Performance Evaluation |
publisher |
MDPI AG |
series |
Sensors |
issn |
1424-8220 |
publishDate |
2021-06-01 |
description |
When deploying a model for object detection, a confidence score threshold is chosen to filter out false positives and ensure that a predicted bounding box has a certain minimum score. To achieve state-of-the-art performance on benchmark datasets, most neural networks use a rather low threshold, as a high number of false positives is not penalized by standard evaluation metrics. However, in Artificial Intelligence (AI) applications that require high confidence scores (e.g., due to legal requirements or because the consequences of incorrect detections are severe) or a certain level of model robustness, it is unclear which base model to use, since the models were mainly optimized for benchmark scores. In this paper, we propose a method to find the optimum performance point of a model as a basis for fairer comparison and deeper insights into the trade-offs caused by selecting a confidence score threshold. |
topic |
computer vision deep neural networks object detection confidence score |
url |
https://www.mdpi.com/1424-8220/21/13/4350 |
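
The abstract describes filtering predicted bounding boxes by a confidence score threshold and the trade-off that the choice of threshold creates. The following is a minimal sketch of that idea, not the method proposed in the article: it assumes each detection has already been matched to ground truth (for example by an IoU criterion) and uses the F1 score as a stand-in selection criterion. The names `Detection`, `filter_by_confidence`, `f1_at_threshold`, `best_threshold`, and the sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    score: float            # confidence score in [0, 1]
    is_true_positive: bool  # assumption: matching to ground truth was done beforehand


def filter_by_confidence(detections, threshold):
    """Keep only detections whose score reaches the threshold."""
    return [d for d in detections if d.score >= threshold]


def f1_at_threshold(detections, num_ground_truth, threshold):
    """F1 score of the detections that survive the given threshold."""
    kept = filter_by_confidence(detections, threshold)
    tp = sum(d.is_true_positive for d in kept)
    fp = len(kept) - tp
    fn = num_ground_truth - tp
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def best_threshold(detections, num_ground_truth, thresholds):
    """Candidate threshold with the highest F1 (first one wins on ties)."""
    return max(thresholds,
               key=lambda t: f1_at_threshold(detections, num_ground_truth, t))


# Hypothetical sample: 7 predictions for an image set with 4 ground-truth objects.
sample = [
    Detection(0.95, True), Detection(0.90, True), Detection(0.60, False),
    Detection(0.55, True), Detection(0.30, False), Detection(0.20, False),
    Detection(0.10, False),
]
candidates = [0.05, 0.5, 0.7, 0.92]
chosen = best_threshold(sample, num_ground_truth=4, thresholds=candidates)
```

In this toy sample, a very low threshold keeps every true positive but also every false positive, while a very high threshold discards most correct detections; sweeping the candidates makes that trade-off explicit for a single model.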