Summary: | In recent years, several designs that use in-memory processing to accelerate machine-learning inference have been proposed. Such designs are also well suited to discrete, dynamic, and distributed systems that solve large-dimensional optimization problems using iterative algorithms. For in-memory computation, ferroelectric field-effect transistors (FerroFETs), owing to their compact area and multiple distinguishable states, offer promising possibilities. We present a distributed architecture that uses FerroFET memory and implements in-memory processing to solve a template problem of least squares minimization. With this architecture, we demonstrate a 21× improvement in energy efficiency and a 3× improvement in compute time compared to a static random access memory (SRAM)-based processing-in-memory (PIM) architecture.
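To make the template problem concrete, the sketch below shows one common iterative formulation of least squares minimization, gradient descent on ||Ax − b||², whose dominant cost is matrix-vector products of the kind a PIM/FerroFET array would perform in memory. The function name, step size, and iteration count are illustrative assumptions, not details taken from the paper; the matrix products here are ordinary NumPy calls used only to indicate where in-memory operations would occur.

```python
import numpy as np

def least_squares_gd(A, b, lr=1e-3, iters=1000):
    """Hypothetical sketch: minimize ||A x - b||^2 by gradient descent.

    In a processing-in-memory design, the two matrix-vector products per
    iteration would be computed inside the memory array; here they are
    plain NumPy operations for illustration.
    """
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        r = A @ x - b          # residual (first matrix-vector product)
        x -= lr * (A.T @ r)    # gradient step (second matrix-vector product)
    return x

# Example usage on a random overdetermined system (illustrative only)
rng = np.random.default_rng(0)
A = rng.standard_normal((100, 10))
b = rng.standard_normal(100)
x_hat = least_squares_gd(A, b)
```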