Summary: | We live in the Information Age, in which the technology industry allows individuals to explore their personalized needs, thereby simplifying decision-making. It also allows big global market players to leverage the vast amounts of information they collect over time in order to excel in the markets in which they operate. The huge and often incomprehensible volumes of information collected to date constitute the phenomenon of Big Data: a term used to describe datasets that are not suitable for processing by traditional software. To date, the most common way to extract value from Big Data is to employ a wide range of machine learning techniques. Machine learning is genuinely data-driven, and from a statistical point of view, the more data are available, the better. This enables the creation of applications for a broad spectrum of modeling and predictive tasks. Traditional machine learning methods (e.g. linear models) are easy to implement and yield computationally cheap solutions. These solutions, however, are not always capable of capturing the underlying complexity of Big Data. More sophisticated approaches (e.g. Convolutional Neural Networks in computer vision) have been shown empirically to be reliable, but this reliability comes at a high computational cost. A natural way to overcome this obstacle appears to be a reduction of data volume (the number of factors, attributes, and records). Doing so, however, is an extremely tedious and non-trivial task in itself. In this thesis we show that, thanks to the well-known concentration of measure effect, it is often beneficial to keep the dimensionality of the problem high and to use it to one's advantage. The concentration of measure effect is a phenomenon found only in high-dimensional spaces. One of the theoretical findings of this thesis is that the measure concentration effect allows one to correct individual mistakes of Artificial Intelligence (AI) systems in a cheap and non-intrusive way. Specifically, we show how to correct the errors of AI systems with linear functionals without changing their inner decision-making processes. As an illustration of how one can benefit from this, we have developed a Knowledge Transfer framework for legacy AI systems. The development of this framework also answers a fundamental question: how can a legacy "student" AI system learn from a "teacher" AI system without complete retraining? The theoretical findings are illustrated with several case studies in the area of computer vision.
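The central technical claim above, that in high dimension a single linear functional can separate an erroneous sample from the rest of the data and thereby act as a cheap, non-intrusive corrector, can be conveyed with a minimal sketch. The Python code below is illustrative only and is not taken from the thesis: the synthetic Gaussian features, the Fisher-style direction `w`, the threshold rule, and the `corrected_decision` wrapper are hypothetical stand-ins for whatever feature space and corrector construction the thesis actually uses.

```python
# Minimal sketch (assumed, not the thesis's code): separating one "error"
# sample from a large dataset with a single linear functional, so a legacy
# AI system's mistake can be overridden without retraining the system.
import numpy as np

rng = np.random.default_rng(0)
n, d = 10_000, 2_000              # many samples, high-dimensional features

# Hypothetical feature vectors, e.g. activations from a legacy system's
# hidden layer; standard Gaussian data stands in for real features.
X = rng.standard_normal((n, d))
error_idx = 0                     # suppose sample 0 triggered a wrong decision

# Fisher-style linear discriminant: project onto the direction from the
# data mean to the error point.
mu = X.mean(axis=0)
w = X[error_idx] - mu
w /= np.linalg.norm(w)

scores = (X - mu) @ w
# Threshold halfway between the error's score and the best non-error score.
threshold = 0.5 * (scores[error_idx] + np.delete(scores, error_idx).max())

def corrected_decision(x, legacy_decision):
    """Return the legacy system's decision unless the corrector fires."""
    if (x - mu) @ w > threshold:   # corrector flags inputs near the error
        return not legacy_decision  # hypothetical override rule
    return legacy_decision

# Concentration of measure: when d is large relative to log(n), the error
# point is linearly separable from all other samples with high probability,
# so the corrector fires on (essentially) only the one bad sample.
print("samples flagged by corrector:", int((scores > threshold).sum()))  # ~1
```

The design point the sketch illustrates is that the legacy system itself is never modified: the corrector is a one-neuron cascade bolted on after the fact, which is what makes the correction cheap and non-intrusive.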