TwinNet: A Double Sub-Network Framework for Detecting Universal Adversarial Perturbations

Bibliographic Details
Main Authors: Yibin Ruan, Jiazhu Dai
Format: Article
Language: English
Published: MDPI AG 2018-03-01
Series: Future Internet
Subjects: PCA
Online Access: http://www.mdpi.com/1999-5903/10/3/26
Description
Summary: Deep neural networks have achieved great progress on tasks involving complex abstract concepts. However, there exist adversarial perturbations, imperceptible to humans, that can severely undermine the performance of deep neural network classifiers. Moreover, universal adversarial perturbations can fool classifiers on almost all examples with just a single perturbation vector. In this paper, we propose TwinNet, a framework for neural network classifiers to detect such adversarial perturbations. TwinNet makes no modification to the protected classifier. It detects adversarially perturbed examples by enhancing different types of features in dedicated networks and then fusing the outputs of those networks. The paper shows empirically that the framework identifies adversarial perturbations effectively with only a slight loss in accuracy on normal examples, and that it outperforms state-of-the-art works.
ISSN: 1999-5903