Transfer Learning and Decision Fusion for Real Time Distortion Classification in Laparoscopic Videos

Laparoscopic surgery is a surgical procedure performed by inserting narrow tubes into the abdomen without making large incisions in the skin. It is done with the aid of a video camera. Laparoscopic videos are affected by various distortions during surgery which lead to loss of visual quality. Identi...

Bibliographic Details
Main Authors: Nouar Aldahoul, Hezerul Abdul Karim, Myles Joshua Toledo Tan, Jamie Ledesma Fermin
Format: Article
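Language: English
Published: IEEE 2021-01-01
Series: IEEE Access
Subjects: Decision fusion, distortion classification, laparoscopic video, multilabel classification, real time, transfer learning
Online Access: https://ieeexplore.ieee.org/document/9514915/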
id doaj-ab3f9b32305e4e7bb4521350c8d01ee2
record_format Article
spelling doaj-ab3f9b32305e4e7bb4521350c8d01ee2
2021-08-23T23:00:32Z
eng
IEEE
IEEE Access, ISSN 2169-3536
2021-01-01, volume 9, pages 115006-115018
DOI 10.1109/ACCESS.2021.3105454
IEEE Xplore document 9514915, https://ieeexplore.ieee.org/document/9514915/
Transfer Learning and Decision Fusion for Real Time Distortion Classification in Laparoscopic Videos
Nouar Aldahoul, Faculty of Engineering, Multimedia University, Cyberjaya, Malaysia, https://orcid.org/0000-0001-5522-0033
Hezerul Abdul Karim, Faculty of Engineering, Multimedia University, Cyberjaya, Malaysia
Myles Joshua Toledo Tan, Yo-Vivo Corporation, Bacolod, Philippines, https://orcid.org/0000-0002-1426-6526
Jamie Ledesma Fermin, Yo-Vivo Corporation, Bacolod, Philippines, https://orcid.org/0000-0002-7960-3917
collection DOAJ
language English
format Article
sources DOAJ
author Nouar Aldahoul
Hezerul Abdul Karim
Myles Joshua Toledo Tan
Jamie Ledesma Fermin
title Transfer Learning and Decision Fusion for Real Time Distortion Classification in Laparoscopic Videos
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description Laparoscopic surgery is a surgical procedure performed by inserting narrow tubes into the abdomen without making large incisions in the skin. It is done with the aid of a video camera. During surgery, laparoscopic videos are affected by various distortions that degrade visual quality. Identifying these distortions is the first requirement of an automated video enhancement system, which must classify the distortions correctly and then select the appropriate algorithm to enhance video quality. In addition to being accurate, distortion classification must be fast enough to operate under real-time conditions. This paper aims to address the issues faced by similar methods by developing a fast and accurate deep learning model for distortion classification. The dataset proposed by the ICIP2020 conference challenge was used to train and evaluate the proposed method. This challenging dataset contains videos with five types of distortion (noise, smoke, uneven illumination, defocus blur, and motion blur), each at four levels of intensity. This paper discusses the proposed solution, which received the first prize in the ICIP2020 challenge. The solution used a transfer learning approach to transfer representations from the domain of natural images to the domain of laparoscopic videos. We used a pre-trained ResNet50 convolutional neural network (CNN) to extract informative features, which support vector machine (SVM) classifiers then mapped to the distortion categories. The problem of multiple distortions in the same video was formulated as a multi-label distortion classification problem. Transfer learning with decision fusion was found to outperform other solutions in terms of accuracy (83%), F1 score for a single distortion (94.7%), and F1 score for single and multiple distortions (94.9%). In addition, the proposed solution can run in real time with an inference speed of 20 frames per second (FPS).
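The abstract above names multi-label classification and decision fusion but does not state the fusion rule. The following Python sketch is therefore only an illustration under stated assumptions: it assumes each frame has already been given one binary decision per distortion (standing in for the per-distortion SVM outputs on ResNet50 features, which are not implemented here), and it assumes a simple majority vote across frames as the fusion step. The function name `fuse_frame_decisions`, the label strings, and the `threshold` parameter are hypothetical and do not come from the paper.

```python
from typing import Dict, List

# The five distortion types listed in the abstract (label strings are illustrative).
DISTORTIONS = (
    "noise",
    "smoke",
    "uneven_illumination",
    "defocus_blur",
    "motion_blur",
)


def fuse_frame_decisions(
    frame_predictions: List[Dict[str, int]],
    threshold: float = 0.5,
) -> Dict[str, int]:
    """Combine per-frame multi-label decisions into one video-level decision.

    Assumption: majority voting across frames. A distortion is reported for
    the video when the fraction of frames flagged with it exceeds `threshold`.
    Each distortion is voted on independently, so a video can carry several
    labels at once (multi-label output).
    """
    if not frame_predictions:
        raise ValueError("at least one frame prediction is required")

    n_frames = len(frame_predictions)
    fused = {}
    for label in DISTORTIONS:
        # Frames that omit a label are treated as voting "absent" (0).
        votes = sum(pred.get(label, 0) for pred in frame_predictions)
        fused[label] = 1 if votes / n_frames > threshold else 0
    return fused


# Hypothetical per-frame decisions for a three-frame clip.
frames = [
    {"smoke": 1, "noise": 1},
    {"smoke": 1},
    {"motion_blur": 1},
]
video_labels = fuse_frame_decisions(frames)
```

In this example, smoke is flagged in two of three frames and is kept at the video level, while noise and motion blur each appear in one frame and are voted out. Other fusion rules, such as averaging classifier scores, would fit the same interface.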
topic Decision fusion
distortion classification
laparoscopic video
multilabel classification
real time
transfer learning
url https://ieeexplore.ieee.org/document/9514915/