The Information Bottleneck : Connections to Other Problems, Learning and Exploration of the IB Curve

In this thesis we study the information bottleneck (IB) method. This is an information-theoretic framework that addresses which factors of a random variable X are relevant for explaining another, statistically dependent random variable Y. These factors are embedded into a bottleneck var...

Bibliographic Details
Main Author:	Rodriguez Galvez, Borja
Format:	Others
Language:	English
Published:	KTH, Skolan för elektroteknik och datavetenskap (EECS) 2019
Subjects:	Engineering and Technology Teknik och teknologier
Online Access:	http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254421

id	ndltd-UPSALLA1-oai-DiVA.org-kth-254421
record_format	oai_dc
collection	NDLTD
language	English
format	Others
sources	NDLTD
topic	Engineering and Technology Teknik och teknologier
description	In this thesis we study the information bottleneck (IB) method. This is an information-theoretic framework that addresses which factors of a random variable X are relevant for explaining another, statistically dependent random variable Y. These factors are embedded into a bottleneck variable T obeying the Markov condition Y ↔ X ↔ T. The contributions of the thesis are threefold: (i) The thesis serves as a survey of the existing connections of the information bottleneck method with rate-distortion theory and with minimal sufficient statistics, for which we also extend the known theory by proving some previously unproved results and deriving new connections. (ii) The thesis also serves as a survey of the information bottleneck and learning. We recover the main results on sample bounds for learning, prove them insufficient for real-world problems, and show the importance of the recently found ties between information and generalization. Moreover, we provide a clear intuition of why the information bottleneck is a good objective function for supervised learning tasks. Furthermore, we provide a new information-theoretic generalization bound for linear models which, to the best of our knowledge, is the first that does not depend on the cardinality of the random variables. (iii) Finally, the main contribution of the thesis is the set of results regarding the exploration of the IB curve. The IB curve is the set of points describing the solutions of the information bottleneck optimization in terms of compression of the inputs and explainability of the output. We introduce the convex IB Lagrangian, an objective function which allows us to explore the IB curve (in contrast to the previously used IB Lagrangian). Furthermore, we prove there is a bijective mapping between the Lagrange multiplier used and the obtained point on the IB curve, provided the shape of the IB curve is known. This means one could design the Lagrange multiplier to obtain a desired level of compression or explainability.
author	Rodriguez Galvez, Borja
title	The Information Bottleneck : Connections to Other Problems, Learning and Exploration of the IB Curve
publisher	KTH, Skolan för elektroteknik och datavetenskap (EECS)
publishDate	2019
url	http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254421
spelling	ndltd-UPSALLA1-oai-DiVA.org-kth-254421 | 2019-06-28T09:53:01Z | The Information Bottleneck : Connections to Other Problems, Learning and Exploration of the IB Curve | eng | Rodriguez Galvez, Borja | KTH, Skolan för elektroteknik och datavetenskap (EECS) | 2019 | Engineering and Technology | Teknik och teknologier | Student thesis | info:eu-repo/semantics/bachelorThesis | text | http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-254421 | TRITA-EECS-EX ; 2019:296 | application/pdf | info:eu-repo/semantics/openAccess

Similar Items