A Tunable Loss Function for Robust Classification: Calibration, Landscape, and Generalization

Bibliographic Details
Main Authors: Cava, J.K. (Author), Dasarathy, G. (Author), Diaz, M. (Author), Kairouz, P. (Author), Sankar, L. (Author), Sypherd, T. (Author)
Format: Article
Language: English
Published: Institute of Electrical and Electronics Engineers Inc. 2022
Subjects:
Online Access: View Fulltext in Publisher
LEADER 02830nam a2200505Ia 4500
001 10.1109-TIT.2022.3169440
008 220510s2022 CNT 000 0 eng d
020 |a 0018-9448 (ISSN) 
245 1 0 |a A Tunable Loss Function for Robust Classification: Calibration, Landscape, and Generalization 
260 0 |b Institute of Electrical and Electronics Engineers Inc.  |c 2022 
856 |z View Fulltext in Publisher  |u https://doi.org/10.1109/TIT.2022.3169440 
520 3 |a We introduce a tunable loss function called α-loss, parameterized by α ∈ (0,∞], which interpolates between the exponential loss (α = 1/2), the log-loss (α = 1), and the 0-1 loss (α = ∞), for the machine learning setting of classification. Theoretically, we illustrate a fundamental connection between α-loss and Arimoto conditional entropy, verify the classification-calibration of α-loss in order to demonstrate asymptotic optimality via Rademacher complexity generalization techniques, and build upon a notion called strictly local quasi-convexity in order to quantitatively characterize the optimization landscape of α-loss. Practically, we perform class imbalance, robustness, and classification experiments on benchmark image datasets using convolutional neural networks. Our main practical conclusion is that certain tasks may benefit from tuning α-loss away from log-loss (α = 1), and to this end we provide simple heuristics for the practitioner. In particular, navigating the α hyperparameter can readily provide superior model robustness to label flips (α > 1) and sensitivity to imbalanced classes (α < 1). 
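The interpolation described in the abstract admits a short numerical check. As a hedged sketch (not part of the record itself), the Python below assumes the closed form ℓ_α(p) = (α/(α−1))(1 − p^((α−1)/α)) for α ≠ 1 and ℓ_1(p) = −log p, where p is the probability the model assigns to the true label; the name alpha_loss is illustrative.

    import numpy as np

    def alpha_loss(p, alpha):
        # Assumed closed form of α-loss on the true-class probability p:
        #   alpha = 1 :  -log(p)                                 (log-loss)
        #   alpha != 1:  (alpha/(alpha-1)) * (1 - p**((alpha-1)/alpha))
        p = np.asarray(p, dtype=float)
        if alpha == 1.0:
            return -np.log(p)
        return (alpha / (alpha - 1.0)) * (1.0 - p ** ((alpha - 1.0) / alpha))

    p = np.array([0.2, 0.5, 0.9])
    print(alpha_loss(p, 0.5))   # equals 1/p - 1: exponential-loss regime
    print(alpha_loss(p, 1.0))   # equals -log(p): log-loss
    print(alpha_loss(p, 1e6))   # approaches 1 - p: the 0-1 loss limit

Under this form, α > 1 flattens the penalty on low-probability (possibly mislabeled) examples, consistent with the robustness-to-label-flips claim, while α < 1 penalizes them more sharply, matching the sensitivity-to-imbalanced-classes claim.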
650 0 4 |a α-loss 
650 0 4 |a Arimoto conditional entropy 
650 0 4 |a Calibration 
650 0 4 |a Classification (of information) 
650 0 4 |a Classification algorithms 
650 0 4 |a Classification-calibration 
650 0 4 |a Conditional entropy 
650 0 4 |a Entropy 
650 0 4 |a generalization 
650 0 4 |a Logistics 
650 0 4 |a Neural networks 
650 0 4 |a Noise measurement 
650 0 4 |a Optimization 
650 0 4 |a Privacy 
650 0 4 |a Quasi convexity 
650 0 4 |a Robustness 
650 0 4 |a Strictly local quasi-convexity 
700 1 |a Cava, J.K.  |e author 
700 1 |a Dasarathy, G.  |e author 
700 1 |a Diaz, M.  |e author 
700 1 |a Kairouz, P.  |e author 
700 1 |a Sankar, L.  |e author 
700 1 |a Sypherd, T.  |e author 
773 |t IEEE Transactions on Information Theory