Understanding Mixup Training Methods

Bibliographic Details
Main Authors: Daojun Liang, Feng Yang, Tian Zhang, Peiwei Yang
Format: Article
Language: English
Published: IEEE 2018-01-01
Series: IEEE Access
Online Access: https://ieeexplore.ieee.org/document/8478159/
Description
Summary: Mixup is a neural network training method that generates new samples by linearly interpolating multiple samples and their labels. Mixup training generalizes better than the traditional empirical risk minimization (ERM) method, but a more intuitive understanding of why it performs better has been lacking. In this paper, several different sample-mixing methods are used to test how neural networks learn and infer from mixed samples, illustrating how mixup works as a data augmentation method and how it regularizes neural networks. Then, a weighted noise-perturbation method is designed to visualize the loss functions of the mixup and ERM training methods and to analyze the properties of their high-dimensional decision surfaces. Finally, by analyzing the mixing of samples and their labels, a spatial mixup approach is proposed that achieves state-of-the-art performance on the CIFAR and ImageNet data sets. This method also gives generative adversarial networks a more stable training process and a more diverse sample generation ability.
ISSN: 2169-3536
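
As a minimal sketch of the interpolation step described in the summary (not the authors' implementation; the function name and the Beta-distribution parameterization follow the commonly used mixup formulation), the following Python snippet mixes a batch of samples and one-hot labels:

import numpy as np

def mixup_batch(x, y, alpha=1.0, rng=None):
    # Mix a batch of inputs x (batch, ...) and one-hot labels y (batch, classes)
    # by linear interpolation with a coefficient drawn from Beta(alpha, alpha).
    # This is an illustrative sketch of the standard mixup formulation.
    rng = rng or np.random.default_rng()
    lam = rng.beta(alpha, alpha)           # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))         # pair each sample with a random partner
    x_mixed = lam * x + (1.0 - lam) * x[perm]
    y_mixed = lam * y + (1.0 - lam) * y[perm]
    return x_mixed, y_mixed

Training then proceeds by minimizing the usual loss on (x_mixed, y_mixed); with alpha = 1.0 the coefficient is uniform on [0, 1], while smaller alpha keeps mixed samples close to the originals.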