Benign interpolation of noise in deep learning

The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance tradeoff in l...

Bibliographic Details
Main Authors:	Marthinus Wilhelmus Theunissen, Marelie Davel, Etienne Barnard
Format:	Article
Language:	English
Published:	South African Institute of Computer Scientists and Information Technologists 2020-12-01
Series:	South African Computer Journal
Online Access:	https://sacj.cs.uct.ac.za/index.php/sacj/article/view/833

id	doaj-40742585d9c346c899f757792430a34e
record_format	Article
spelling	doaj-40742585d9c346c899f757792430a34e | 2020-12-08T07:46:49Z | eng | South African Institute of Computer Scientists and Information Technologists | South African Computer Journal | 1015-7999 | 2313-7835 | 2020-12-01 | 32 | 2 | 10.18489/sacj.v32i2.833 | 748 | Benign interpolation of noise in deep learning | Marthinus Wilhelmus Theunissen | 0 | https://orcid.org/0000-0002-7456-7769 | Marelie Davel | 1 | https://orcid.org/0000-0003-3103-5858 | Etienne Barnard | 2 | https://orcid.org/0000-0003-2202-2369 | Multilingual Speech Technologies, North-West University, South Africa | Multilingual Speech Technologies, North-West University, South Africa | Multilingual Speech Technologies, North-West University, South Africa | The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance tradeoff in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units. | https://sacj.cs.uct.ac.za/index.php/sacj/article/view/833
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Marthinus Wilhelmus Theunissen Marelie Davel Etienne Barnard
spellingShingle	Marthinus Wilhelmus Theunissen Marelie Davel Etienne Barnard Benign interpolation of noise in deep learning South African Computer Journal
author_facet	Marthinus Wilhelmus Theunissen Marelie Davel Etienne Barnard
author_sort	Marthinus Wilhelmus Theunissen
title	Benign interpolation of noise in deep learning
title_short	Benign interpolation of noise in deep learning
title_full	Benign interpolation of noise in deep learning
title_fullStr	Benign interpolation of noise in deep learning
title_full_unstemmed	Benign interpolation of noise in deep learning
title_sort	benign interpolation of noise in deep learning
publisher	South African Institute of Computer Scientists and Information Technologists
series	South African Computer Journal
issn	1015-7999 2313-7835
publishDate	2020-12-01
description	The understanding of generalisation in machine learning is in a state of flux, in part due to the ability of deep learning models to interpolate noisy training data and still perform appropriately on out-of-sample data, thereby contradicting long-held intuitions about the bias-variance tradeoff in learning. We expand upon relevant existing work by discussing local attributes of neural network training within the context of a relatively simple framework. We describe how various types of noise can be compensated for within the proposed framework in order to allow the deep learning model to generalise in spite of interpolating spurious function descriptors. Empirically, we support our postulates with experiments involving overparameterised multilayer perceptrons and controlled training data noise. The main insights are that deep learning models are optimised for training data modularly, with different regions in the function space dedicated to fitting distinct types of sample information. Additionally, we show that models tend to fit uncorrupted samples first. Based on this finding, we propose a conjecture to explain an observed instance of the epoch-wise double-descent phenomenon. Our findings suggest that the notion of model capacity needs to be modified to consider the distributed way training data is fitted across sub-units.
url	https://sacj.cs.uct.ac.za/index.php/sacj/article/view/833
work_keys_str_mv	AT marthinuswilhelmustheunissen benigninterpolationofnoiseindeeplearning AT mareliedavel benigninterpolationofnoiseindeeplearning AT etiennebarnard benigninterpolationofnoiseindeeplearning
_version_	1724391158601220096


Similar Items