YOLBO: You Only Look Back Once–A Low Latency Object Tracker Based on YOLO and Optical Flow
One common computer vision task is to track an object as it moves from frame to frame within a video sequence. There are myriad applications for this capability, and the underlying tracking technologies are well understood. More recently, deep convolutional neural networks have been employed not only to track objects but also to classify them as they are tracked from frame to frame. These models can be used in a tracking paradigm known as tracking by detection and can achieve very high tracking accuracy. The major drawback of these deep neural networks is the large number of mathematical operations that must be performed for each inference, which reduces the number of tracked frames per second. For edge applications residing on size-, weight-, and power-limited platforms, such as unmanned aerial vehicles, high frame rate and low latency real-time tracking can be an elusive target. To overcome the limited power and computational resources of an edge compute device, various optimizations have been performed to trade off tracking speed, accuracy, power, and latency. Previous works on motion-based interpolation with neural networks either do not take into account the latency accrued from camera image capture to tracking result, or they compensate for this latency but are bottlenecked by the motion interpolation operation instead. The algorithm presented in this work retains the performance speedup of previous motion-based neural network inference approaches while also performing a novel look back operation that is less cumbersome than competing motion interpolation methods.
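The abstract sketches the core idea: the detector is too slow to run on every frame, so by the time its result arrives it describes an older frame, and the tracker "looks back" across the frames captured during inference, using optical flow to carry the detected bounding box up to the newest frame. The snippet below is a minimal conceptual sketch of that latency-compensation pattern, not the paper's YOLBO implementation: the frame-buffering scheme, the `propagate_box` helper, the inline `detect` call, and the choice of OpenCV's Farnebäck dense flow are all illustrative assumptions.

```python
import cv2

def propagate_box(box, flow):
    """Shift an (x, y, w, h) box by the mean optical flow inside it."""
    x, y, w, h = box
    region = flow[y:y + h, x:x + w]             # (h, w, 2) flow vectors under the box
    dx, dy = region.reshape(-1, 2).mean(axis=0)
    return (int(round(x + dx)), int(round(y + dy)), w, h)

def track(frames, detect, inference_lag):
    """Yield one box per frame once detections are available.
    `frames` is an iterable of grayscale frames, `detect` maps a frame to
    an (x, y, w, h) box, and `inference_lag` is how many frames old a
    detection is by the time the detector returns it."""
    buffer = []
    for frame in frames:
        buffer.append(frame)
        if len(buffer) <= inference_lag:
            continue                            # still waiting for the first result
        # The detection corresponds to the frame captured `inference_lag`
        # frames ago (an inline call stands in for an asynchronous detector).
        box = detect(buffer[-1 - inference_lag])
        # "Look back" across the frames buffered during inference, chaining
        # dense optical flow to carry the box up to the newest frame.
        for prev, nxt in zip(buffer[-1 - inference_lag:-1], buffer[-inference_lag:]):
            flow = cv2.calcOpticalFlowFarneback(prev, nxt, None,
                                                0.5, 3, 15, 3, 5, 1.2, 0)
            box = propagate_box(box, flow)
        yield box                               # box aligned with the current frame
        buffer.pop(0)
```

In a real system the detector would run asynchronously on its own thread or accelerator while flow is computed on the intervening frames; calling it inline as above simply keeps the sketch self-contained.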
Main Authors: | Daniel S. Kaputa (Rochester Institute of Technology, Rochester, NY, USA; ORCID 0000-0002-5620-6193), Brian P. Landy (Rochester Institute of Technology, Rochester, NY, USA; ORCID 0000-0001-5688-4691) |
---|---|
Format: | Article |
Language: | English |
Published: | IEEE, 2021-01-01 |
Series: | IEEE Access, vol. 9, pp. 82497–82507 |
DOI: | 10.1109/ACCESS.2021.3080136 |
ISSN: | 2169-3536 |
Subjects: | CNN; classifier; detector; neural network; low latency; tracker |
Online Access: | https://ieeexplore.ieee.org/document/9430527/ |