Multiscale image representation in deep learning
Deep learning is a popular field of research whose models can take a variety of data types as input [1, 16, 30]. It is a subfield of machine learning consisting mostly of neural networks. A common challenge in training neural networks, especially when working with images, is the vast...
Main Author: | Stander, Jean-Pierre |
---|---|
Other Authors: | Fabris-Rotelli, Inger Nicolette |
Language: | en |
Published: | University of Pretoria 2021 |
Subjects: | UCTD Mathematical Statistics |
Online Access: | http://hdl.handle.net/2263/78037 |
id |
ndltd-netd.ac.za-oai-union.ndltd.org-up-oai-repository.up.ac.za-2263-78037 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-netd.ac.za-oai-union.ndltd.org-up-oai-repository.up.ac.za-2263-78037 2021-10-26T05:16:17Z Multiscale image representation in deep learning Stander, Jean-Pierre Fabris-Rotelli, Inger Nicolette u15002536@tuks.co.za UCTD Mathematical Statistics
Statistics MSc (Advanced Data Analytics) Unrestricted 2021-01-14T18:19:52Z 2021-01-14T18:19:52Z 2021-05-05 2020 Mini Dissertation http://hdl.handle.net/2263/78037 * A2021 en © 2019 University of Pretoria. All rights reserved. The copyright in this work vests in the University of Pretoria. No part of this work may be reproduced or transmitted in any form or by any means, without the prior written permission of the University of Pretoria. University of Pretoria |
collection |
NDLTD |
language |
en |
sources |
NDLTD |
topic |
UCTD Mathematical Statistics |
description |
Deep learning is a popular field of research whose models can take a variety of data types as input [1, 16, 30]. It is a subfield of machine learning consisting mostly of neural networks. A common challenge in training neural networks, especially when working with images, is the vast amount of data required. Because of this, various data augmentation techniques have been proposed to create more data at low cost while keeping the labelling of the data accurate [65]. When a model is trained on images, these augmentations include rotating, flipping and cropping the images [21]. An added advantage of data augmentation is that it makes the model more robust to rotation and transformation of an object in an image [65].
In this mini-dissertation we investigate the use of the Discrete Pulse Transform [54, 2] decomposition algorithm and its Discrete Pulse Vectors (DPVs) [17] as data augmentation for image classification in deep learning. The DPVs are used to extract features from the image. A convolutional neural network is trained on the original and augmented images and compared with a convolutional neural network trained only on the unaugmented images. The purpose of the models is to classify an image correctly as either a cat or a dog. The training and testing accuracies of the two approaches are similar. The loss of the model trained with the proposed data augmentation is improved. When the probabilities predicted by the model are used with a custom cut-off to assign an image to one of the two classes, the model trained with the proposed augmentation outperforms the model trained without it. === Mini Dissertation (MSc (Advanced Data Analytics))--University of Pretoria, 2020. === The financial assistance of the National Research Foundation (NRF) towards this research is hereby acknowledged. Opinions expressed and conclusions arrived at, are those of the author and are not necessarily to be attributed to the NRF. === Statistics === MSc (Advanced Data Analytics) === Unrestricted |
author2 |
Fabris-Rotelli, Inger Nicolette |
author_facet |
Fabris-Rotelli, Inger Nicolette Stander, Jean-Pierre |
author |
Stander, Jean-Pierre |
title |
Multiscale image representation in deep learning |
publisher |
University of Pretoria |
publishDate |
2021 |
url |
http://hdl.handle.net/2263/78037 |
_version_ |
1719491162448855040 |
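
The abstract mentions classifying an image by applying a custom cut-off to the probabilities predicted by the model. The record does not say how that cut-off was chosen, so the following is only a minimal sketch under stated assumptions: the function names, the candidate grid, and the accuracy-based selection rule are all hypothetical, and the probabilities and labels are invented for illustration.

```python
from typing import List, Sequence


def classify_with_cutoff(probs_dog: Sequence[float], cutoff: float = 0.5) -> List[str]:
    """Label an image 'dog' if its predicted probability reaches the cutoff, else 'cat'."""
    return ["dog" if p >= cutoff else "cat" for p in probs_dog]


def accuracy(predicted: Sequence[str], actual: Sequence[str]) -> float:
    """Fraction of predictions that match the true labels."""
    matches = sum(p == a for p, a in zip(predicted, actual))
    return matches / len(actual)


def best_cutoff(
    probs_dog: Sequence[float],
    actual: Sequence[str],
    candidates: Sequence[float],
) -> float:
    """Return the candidate cutoff with the highest accuracy (assumed selection rule).

    On ties, the earliest candidate in the sequence is returned.
    """
    return max(
        candidates,
        key=lambda c: accuracy(classify_with_cutoff(probs_dog, c), actual),
    )


# Hypothetical validation data, invented for illustration only.
probs = [0.91, 0.62, 0.58, 0.35, 0.12]
labels = ["dog", "dog", "cat", "cat", "cat"]

chosen = best_cutoff(probs, labels, [0.3, 0.4, 0.5, 0.6, 0.7])
predictions = classify_with_cutoff(probs, chosen)
```

In practice the cut-off would be selected on held-out validation data rather than on the test set, so that the reported comparison between the two models is not biased by the selection step.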