Learning via Query Synthesis

Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying...


Bibliographic Details
Main Author: Alabdulmohsin, Ibrahim Mansour
Other Authors: Zhang, Xiangliang
Language: en
Published: 2017
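Subjects: active learning; query synthesis; reverse engineering; support vector machine; indefinite kernels; linear classification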
Online Access: http://hdl.handle.net/10754/623482
http://repository.kaust.edu.sa/kaust/handle/10754/623482
id ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-623482
record_format oai_dc
spelling ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-623482
2017-05-17T04:02:10Z
Learning via Query Synthesis
Alabdulmohsin, Ibrahim Mansour
Zhang, Xiangliang
Computer, Electrical and Mathematical Sciences and Engineering (CEMSE) Division
Keyes, David E.
Wang, Wei
Gao, Xin
2017-05-07
Dissertation
en
collection NDLTD
language en
sources NDLTD
topic active learning
query synthesis
reverse engineering
support vector machine
indefinite kernels
linear classification
description Active learning is a subfield of machine learning that has been successfully used in many applications. One of the main branches of active learning is query synthesis, where the learning agent constructs artificial queries from scratch in order to reveal sensitive information about the underlying decision boundary. It has found applications in areas such as adversarial reverse engineering, automated science, and computational chemistry. Nevertheless, the existing literature on membership query synthesis has generally focused on finite concept classes or toy problems, with limited extension to real-world applications. In this thesis, I develop two spectral algorithms for learning halfspaces via query synthesis. The first algorithm is a maximum-determinant convex optimization method, while the second is a Markovian method that relies on Khachiyan’s classical update formulas for solving linear programs. The general theme of these methods is to construct an ellipsoidal approximation of the version space and then to synthesize queries via spectral decomposition. I also describe how these algorithms can be extended to other settings, such as pool-based active learning. Having demonstrated that halfspaces can be learned quite efficiently via query synthesis, the second part of this thesis proposes strategies for mitigating the risk of reverse engineering in adversarial environments. One approach that can render query synthesis algorithms ineffective is to implement a randomized response. In this thesis, I propose a semidefinite program (SDP) for learning a distribution of classifiers, subject to the constraint that any individual classifier picked at random from this distribution provides reliable predictions with high probability. This algorithm is then justified both theoretically and empirically. A second approach is to use a non-parametric classification method, such as similarity-based classification. In this thesis, I argue that learning via empirical kernel maps, also commonly referred to as the 1-norm Support Vector Machine (SVM) or Linear Programming (LP) SVM, is the best method for handling indefinite similarities. The advantages of this method are established both theoretically and empirically.
author2 Zhang, Xiangliang
author_facet Zhang, Xiangliang
Alabdulmohsin, Ibrahim Mansour
author Alabdulmohsin, Ibrahim Mansour
author_sort Alabdulmohsin, Ibrahim Mansour
title Learning via Query Synthesis
publishDate 2017
url http://hdl.handle.net/10754/623482
http://repository.kaust.edu.sa/kaust/handle/10754/623482