Neural Network Training With Asymmetric Crosspoint Elements

Analog crossbar arrays comprising programmable non-volatile resistors are under intense investigation for acceleration of deep neural network training. However, the ubiquitous asymmetric conductance modulation of practical resistive devices critically degrades the classification performance of networks trained with conventional algorithms. Here we first describe the fundamental reasons behind this incompatibility. Then, we explain the theoretical underpinnings of a novel fully parallel training algorithm that is compatible with asymmetric crosspoint elements. By establishing a powerful analogy with classical mechanics, we explain how device asymmetry can be exploited as a useful feature for analog deep learning processors. Instead of conventionally tuning weights in the direction of the error function gradient, network parameters can be programmed to successfully minimize the total energy (Hamiltonian) of the system that incorporates the effects of device asymmetry. Our technique enables immediate realization of analog deep learning accelerators based on readily available device technologies.
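The abstract's central idea, replacing gradient-direction weight tuning with updates that minimize a total energy (Hamiltonian) absorbing the device asymmetry, can be illustrated with a toy simulation. The sketch below is not the authors' algorithm: it assumes a hypothetical "soft bounds" device model, and the W_MIN/W_MAX limits, the asymmetric_step helper, and all constants are illustrative. It only shows the underlying phenomenon: with asymmetric potentiation and depression, noisy SGD updates no longer cancel, so the weight drifts away from the loss minimum toward the device's symmetry point.

```python
# A minimal illustrative sketch (not the authors' implementation), assuming a
# common "soft bounds" asymmetric device model: potentiation and depression
# pulses shrink as the conductance approaches its upper or lower limit, so
# equal-and-opposite updates no longer cancel.
import random

random.seed(0)

W_MIN, W_MAX = -1.0, 1.0  # hypothetical conductance-mapped weight limits

def asymmetric_step(w, grad, lr=0.1):
    """SGD-style update filtered through the asymmetric device model.

    Repeated noisy +/- pulses drift the weight toward the device's
    symmetry point (here w = 0), acting like a built-in decay term.
    """
    dw = -lr * grad
    if dw > 0:                                   # potentiation pulse
        scale = (W_MAX - w) / (W_MAX - W_MIN)
    else:                                        # depression pulse
        scale = (w - W_MIN) / (W_MAX - W_MIN)
    return w + dw * scale

# Toy quadratic loss L(w) = 0.5 * (w - 0.8)^2 with noisy gradient samples.
w_ideal = w_asym = 0.0
for _ in range(2000):
    noise = random.gauss(0.0, 0.5)
    w_ideal += -0.1 * ((w_ideal - 0.8) + noise)          # ideal symmetric SGD
    w_asym = asymmetric_step(w_asym, (w_asym - 0.8) + noise)

print(f"ideal SGD:         {w_ideal:+.3f}")  # settles near the minimum, 0.8
print(f"asymmetric device: {w_asym:+.3f}")   # biased toward the symmetry point
```

In the paper's framing, this systematic drift is not an error to be compensated but part of the total energy being minimized, which is why the abstract describes asymmetry as a useful feature rather than a defect.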


Bibliographic Details
Main Authors: Onen, Murat (Author), Gokmen, Tayfun (Author), Todorov, Teodor K (Author), Nowicki, Tomasz (Author), del Alamo, Jesús A (Author), Rozen, John (Author), Haensch, Wilfried (Author), Kim, Seyoung (Author)
Format: Article
Language:English
Published: Frontiers Media SA, 2022.
Online Access: Get fulltext
LEADER 01864 am a22002413u 4500
001 143120
042 |a dc 
100 1 0 |a Onen, Murat  |e author 
700 1 0 |a Gokmen, Tayfun  |e author 
700 1 0 |a Todorov, Teodor K  |e author 
700 1 0 |a Nowicki, Tomasz  |e author 
700 1 0 |a del Alamo, Jesús A  |e author 
700 1 0 |a Rozen, John  |e author 
700 1 0 |a Haensch, Wilfried  |e author 
700 1 0 |a Kim, Seyoung  |e author 
245 0 0 |a Neural Network Training With Asymmetric Crosspoint Elements 
260 |b Frontiers Media SA,   |c 2022. 
856 |z Get fulltext  |u https://hdl.handle.net/1721.1/143120 
520 |a Analog crossbar arrays comprising programmable non-volatile resistors are under intense investigation for acceleration of deep neural network training. However, the ubiquitous asymmetric conductance modulation of practical resistive devices critically degrades the classification performance of networks trained with conventional algorithms. Here we first describe the fundamental reasons behind this incompatibility. Then, we explain the theoretical underpinnings of a novel fully parallel training algorithm that is compatible with asymmetric crosspoint elements. By establishing a powerful analogy with classical mechanics, we explain how device asymmetry can be exploited as a useful feature for analog deep learning processors. Instead of conventionally tuning weights in the direction of the error function gradient, network parameters can be programmed to successfully minimize the total energy (Hamiltonian) of the system that incorporates the effects of device asymmetry. Our technique enables immediate realization of analog deep learning accelerators based on readily available device technologies. 
546 |a en 
655 7 |a Article 
773 |t Frontiers in Artificial Intelligence  |o 10.3389/frai.2022.891624