Can we trust AI? towards practical implementation and theoretical analysis in trustworthy machine learning

Deep learning, or deep neural networks (DNNs), has achieved extraordinary performance in many application domains, such as image classification, object detection and recognition, natural language processing, and medical image analysis. It is also well accepted that DNNs are vulnerable to adversarial attacks, which raises concerns about deploying them in security-critical applications, where failures may have disastrous consequences. Adversarial attacks are usually carried out by generating adversarial examples: carefully crafted perturbations are added to benign examples so that the DNN classifies the perturbed inputs as target (wrong) labels instead of the correct labels of the benign examples. Adversarial machine learning studies this phenomenon and leverages it both to build robust machine learning systems and to explain DNNs.

In this dissertation, we examine the mechanisms of adversarial machine learning from both empirical and theoretical perspectives. First, we introduce a unified adversarial attack generation framework, the structured attack (StrAttack), which exploits group sparsity in adversarial perturbations by sliding a mask over the image to extract key spatial structures. Second, we discuss the feasibility of adversarial attacks in the physical world and introduce a powerful framework, Expectation over Transformation (EoT). Combining EoT with the Thin Plate Spline (TPS) transformation, we generate the Adversarial T-shirt, a robust physical adversarial example that evades person detectors even when it undergoes non-rigid deformation caused by a moving person's pose changes. Third, on the defense side, we propose the first adversarial training method based on graph neural networks. Fourth, we introduce linear relaxation based perturbation analysis (LiRPA) for neural networks, which computes provable linear bounds on the output neurons under a given amount of input perturbation; LiRPA treats adversarial examples theoretically and can certify a model's test accuracy under given perturbation constraints. Finally, by combining efficient LiRPA with branch and bound, we speed up the conventional Linear Programming-based complete verification framework by an order of magnitude.

In future work, we plan to study a novel patch transformer network that faithfully models real-world physical transformations. In addition, in the direction of formal robustness, we plan to pursue complete verification in real time: given sufficient time, the verifier should efficiently return a definite "yes/no" answer for the property under verification. Our LiRPA framework, combined with GPUs, can potentially accelerate this procedure. --Author's abstract
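As a concrete illustration of the attack recipe above (add a small perturbation to a benign input so the model predicts a wrong label), the following is a minimal PyTorch sketch of a generic projected gradient descent (PGD) attack under an L-infinity budget. It is a simplified stand-in rather than the dissertation's StrAttack; the function name pgd_attack and all hyperparameters are illustrative.

    # Minimal, generic sketch of gradient-based adversarial example generation
    # (untargeted PGD under an L-infinity budget); not the StrAttack method.
    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8 / 255, alpha=2 / 255, steps=10):
        """Return x + delta with ||delta||_inf <= eps that tries to flip the prediction on y."""
        x_adv = x.clone().detach()
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)        # maximize loss on the true labels
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()        # ascend along the gradient sign
                x_adv = x + (x_adv - x).clamp(-eps, eps)   # project back into the eps-ball
                x_adv = x_adv.clamp(0.0, 1.0)              # keep pixels in a valid range
        return x_adv.detach()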

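The physical-world attack rests on the Expectation over Transformation idea: optimize the perturbation against the expected loss over a distribution of random transformations so that it survives real-world variation. Below is a minimal sketch of that expectation as a Monte Carlo average; the cheap stand-in transformations (random brightness and translation) replace the Thin Plate Spline warping used for the Adversarial T-shirt, and the helper names random_transform and eot_loss are assumptions for this illustration.

    # Minimal sketch of the EoT objective: average the attack loss over sampled
    # transformations so the perturbation stays effective under real-world variation.
    # The transformations here are simple stand-ins, not the TPS cloth-deformation model.
    import torch
    import torch.nn.functional as F

    def random_transform(x):
        """Apply a cheap, randomly sampled transformation to a batch of images (N, C, H, W)."""
        b = 0.8 + 0.4 * torch.rand(1, device=x.device)    # random brightness in [0.8, 1.2]
        dx, dy = torch.randint(-4, 5, (2,)).tolist()      # random pixel translation
        return (x * b).roll(shifts=(dy, dx), dims=(-2, -1)).clamp(0, 1)

    def eot_loss(model, x_adv, y, num_samples=8):
        """Monte Carlo estimate of the expected attack loss over the transformation distribution."""
        losses = [F.cross_entropy(model(random_transform(x_adv)), y) for _ in range(num_samples)]
        return torch.stack(losses).mean()   # maximize this w.r.t. the perturbation in x_adv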
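To make the notion of a provable output bound concrete, the sketch below implements interval bound propagation (IBP), the coarsest member of the bound-propagation family that LiRPA generalizes with tighter linear relaxations such as CROWN. It is not the dissertation's LiRPA implementation; it only shows how an L-infinity ball around the input is pushed through Linear and ReLU layers to obtain certified bounds on every logit.

    # Simplified sketch of certified bounds via interval bound propagation (IBP); LiRPA
    # replaces these loose intervals with tighter linear relaxations, not shown here.
    import torch
    import torch.nn as nn

    def interval_bounds(layers, x, eps):
        """Propagate the box [x - eps, x + eps] through a list of Linear/ReLU layers."""
        lb, ub = x - eps, x + eps
        for layer in layers:
            if isinstance(layer, nn.Linear):
                center, radius = (lb + ub) / 2, (ub - lb) / 2
                mid = center @ layer.weight.t() + layer.bias
                rad = radius @ layer.weight.abs().t()      # worst-case spread through the weights
                lb, ub = mid - rad, mid + rad
            elif isinstance(layer, nn.ReLU):
                lb, ub = lb.clamp(min=0), ub.clamp(min=0)  # ReLU is monotone
            else:
                raise NotImplementedError(type(layer))
        return lb, ub

    # If the lower bound of the true class exceeds the upper bound of every other class,
    # the prediction is certified robust for all perturbations inside the eps-ball.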

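Finally, a rough sketch of how branch and bound turns such an incomplete bound into a complete verifier: bound each sub-domain, prune the parts that are already verified, and split the rest until a definite answer emerges. The dissertation's verifier branches on ReLU activations using GPU-accelerated LiRPA bounds; the input-splitting loop below, with the assumed helper margin_lb (which could be built from the interval_bounds sketch above), only illustrates the control flow.

    # Rough sketch of complete verification by branch and bound over the input box.
    # margin_lb(lo, hi) must return a provable lower bound on (true logit - best other logit)
    # over the box [lo, hi]; lb and ub are 1-D tensors describing the input region.
    import torch

    def branch_and_bound(margin_lb, lb, ub, max_splits=10_000, tol=1e-4):
        queue = [(lb, ub)]
        for _ in range(max_splits):
            if not queue:
                return True                    # every sub-box verified: property holds
            lo, hi = queue.pop()
            if margin_lb(lo, hi) > 0:          # this sub-box is already verified; prune it
                continue
            width = hi - lo
            if float(width.max()) < tol:       # bound is (nearly) exact yet still negative
                return False                   # property violated in this region
            d = int(width.argmax())            # split the widest input dimension in half
            mid = float((lo[d] + hi[d]) / 2)
            hi_left, lo_right = hi.clone(), lo.clone()
            hi_left[d], lo_right[d] = mid, mid
            queue += [(lo, hi_left), (lo_right, hi)]
        return True if not queue else None     # None: split budget exhausted, undecided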
Bibliographic Details
Published:
Online Access: http://hdl.handle.net/2047/D20413930