Efficiency of CNN on Heterogeneous Processing Devices
In the development of advanced driver assistance systems, computer vision problems need to be optimized to run efficiently on embedded platforms. Convolutional neural network (CNN) accelerators have proven to be very efficient for embedded camera platforms, such as the ones used for automotive vision s...
Main Author: | Ringenson, Josefin |
---|---|
Format: | Others |
Language: | English |
Published: | Linköpings universitet, Programvara och system, 2019 |
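Subjects: | CNN; Accelerator; Convolution; Heterogeneous Processing Device; AI Engine; FPGA; Hardware Architecture; Automotive Security; Computer Vision; Layer Fusion; Computer Systems; Datorsystem; Embedded Systems; Inbäddad systemteknik |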
Online Access: | http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155034 |
id |
ndltd-UPSALLA1-oai-DiVA.org-liu-155034 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-UPSALLA1-oai-DiVA.org-liu-155034; 2019-03-21T06:20:14Z; Efficiency of CNN on Heterogeneous Processing Devices; eng; Ringenson, Josefin; Linköpings universitet, Programvara och system; 2019; CNN; Accelerator; Convolution; Heterogeneous Processing Device; AI Engine; FPGA; Hardware Architecture; Automotive Security; Computer Vision; Layer Fusion; Computer Systems; Datorsystem; Embedded Systems; Inbäddad systemteknik; In the development of advanced driver assistance systems, computer vision problems need to be optimized to run efficiently on embedded platforms. Convolutional neural network (CNN) accelerators have proven to be very efficient for embedded camera platforms, such as the ones used for automotive vision systems. Therefore, the focus of this thesis is to evaluate the efficiency of a CNN on a future embedded heterogeneous processing device. The memory size in an embedded system is often very limited, and it is necessary to divide the input into multiple tiles. In addition, there are power and speed constraints that need to be met to be able to use a computer vision system in a car. To increase efficiency and optimize the memory usage, different methods for CNN layer fusion are proposed and evaluated for a variety of tile sizes. Several different layer fusion methods and input tile sizes are chosen as optimal solutions, depending on the depth of the layers in the CNN. The solutions investigated in the thesis are most efficient for deep CNN layers, where the number of channels is high.; Student thesis; info:eu-repo/semantics/bachelorThesis; text; http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155034; application/pdf; info:eu-repo/semantics/openAccess |
collection |
NDLTD |
language |
English |
format |
Others |
sources |
NDLTD |
topic |
CNN; Accelerator; Convolution; Heterogeneous Processing Device; AI Engine; FPGA; Hardware Architecture; Automotive Security; Computer Vision; Layer Fusion; Computer Systems; Datorsystem; Embedded Systems; Inbäddad systemteknik |
spellingShingle |
CNN; Accelerator; Convolution; Heterogeneous Processing Device; AI Engine; FPGA; Hardware Architecture; Automotive Security; Computer Vision; Layer Fusion; Computer Systems; Datorsystem; Embedded Systems; Inbäddad systemteknik; Ringenson, Josefin; Efficiency of CNN on Heterogeneous Processing Devices |
description |
In the development of advanced driver assistance systems, computer vision problems need to be optimized to run efficiently on embedded platforms. Convolutional neural network (CNN) accelerators have proven to be very efficient for embedded camera platforms, such as the ones used for automotive vision systems. Therefore, the focus of this thesis is to evaluate the efficiency of a CNN on a future embedded heterogeneous processing device. The memory size in an embedded system is often very limited, and it is necessary to divide the input into multiple tiles. In addition, there are power and speed constraints that need to be met to be able to use a computer vision system in a car. To increase efficiency and optimize the memory usage, different methods for CNN layer fusion are proposed and evaluated for a variety of tile sizes. Several different layer fusion methods and input tile sizes are chosen as optimal solutions, depending on the depth of the layers in the CNN. The solutions investigated in the thesis are most efficient for deep CNN layers, where the number of channels is high. |
author |
Ringenson, Josefin |
author_facet |
Ringenson, Josefin |
author_sort |
Ringenson, Josefin |
title |
Efficiency of CNN on Heterogeneous Processing Devices |
title_short |
Efficiency of CNN on Heterogeneous Processing Devices |
title_full |
Efficiency of CNN on Heterogeneous Processing Devices |
title_fullStr |
Efficiency of CNN on Heterogeneous Processing Devices |
title_full_unstemmed |
Efficiency of CNN on Heterogeneous Processing Devices |
title_sort |
efficiency of cnn on heterogeneous processing devices |
publisher |
Linköpings universitet, Programvara och system |
publishDate |
2019 |
url |
http://urn.kb.se/resolve?urn=urn:nbn:se:liu:diva-155034 |
work_keys_str_mv |
AT ringensonjosefin efficiencyofcnnonheterogeneousprocessingdevices |
_version_ |
1719005155584638976 |