Synthesizing Images From Spatio-Temporal Representations Using Spike-Based Backpropagation

Spiking neural networks (SNNs) offer a promising alternative to current artificial neural networks for enabling low-power, event-driven neuromorphic hardware. Spike-based neuromorphic applications require processing and extracting meaningful information from spatio-temporal data, represented as a series of spike trains over time.
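The spike-train input format referred to above is commonly produced by rate coding. Below is a minimal sketch assuming Poisson rate coding in PyTorch; the function name poisson_encode and the step count T are illustrative, and the paper's exact encoding scheme is not specified in this record.

```python
import torch

def poisson_encode(image: torch.Tensor, T: int = 100) -> torch.Tensor:
    """Turn a float image in [0, 1] into a (T, *image.shape) binary spike train.

    At each time step, a pixel emits a spike with probability equal to its
    normalized intensity, so brighter pixels fire more often over the window.
    """
    return (torch.rand(T, *image.shape) < image).float()

# Example: a flattened 28x28 MNIST digit becomes a 100-step spike train.
digit = torch.rand(784)            # stand-in for a normalized MNIST image
spikes = poisson_encode(digit)     # shape: (100, 784)
```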

Full description

Bibliographic Details
Main Authors: Deboleena Roy, Priyadarshini Panda, Kaushik Roy
Format: Article
Language: English
Published: Frontiers Media S.A. 2019-06-01
Series: Frontiers in Neuroscience
Subjects: autoencoders; spiking neural networks; multimodal; audio to image conversion; backpropagation
Online Access: https://www.frontiersin.org/article/10.3389/fnins.2019.00621/full
collection DOAJ
language English
format Article
sources DOAJ
author Deboleena Roy
Priyadarshini Panda
Kaushik Roy
title Synthesizing Images From Spatio-Temporal Representations Using Spike-Based Backpropagation
publisher Frontiers Media S.A.
series Frontiers in Neuroscience
issn 1662-453X
publishDate 2019-06-01
description Spiking neural networks (SNNs) offer a promising alternative to current artificial neural networks for enabling low-power, event-driven neuromorphic hardware. Spike-based neuromorphic applications require processing and extracting meaningful information from spatio-temporal data, represented as a series of spike trains over time. In this paper, we propose a method to synthesize images from multiple modalities in a spike-based environment. We use spiking autoencoders to convert image and audio inputs into compact spatio-temporal representations that are then decoded for image synthesis. For this, we use a direct training algorithm that computes the loss on the membrane potential of the output layer and back-propagates it using a sigmoid approximation of the neuron's activation function to enable differentiability. The spiking autoencoders are benchmarked on MNIST and Fashion-MNIST and achieve very low reconstruction loss, comparable to ANNs. The spiking autoencoders are then trained to learn meaningful spatio-temporal representations of the data across the two modalities, audio and visual. We synthesize images from audio in a spike-based environment by first generating, and then utilizing, such shared multi-modal spatio-temporal representations. Our audio-to-image synthesis model is tested on the task of converting TI-46 digit audio samples to MNIST images. We synthesize images with high fidelity, and the model achieves performance competitive with ANNs.
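A minimal PyTorch sketch of the training mechanism the description names: a hard threshold generates spikes in the forward pass, a sigmoid derivative stands in as the surrogate gradient in the backward pass, and the reconstruction loss is computed on accumulated output-layer membrane potential. This is an illustration under assumed values (alpha, v_th, leak) and invented names (SpikeFn, LIFLayer), not the authors' released implementation; the paper's reset rule and hyperparameters may differ.

```python
import torch
import torch.nn as nn

class SpikeFn(torch.autograd.Function):
    """Heaviside spike forward; sigmoid derivative as surrogate backward."""
    alpha, v_th = 5.0, 1.0  # surrogate steepness and firing threshold (assumed)

    @staticmethod
    def forward(ctx, v):
        ctx.save_for_backward(v)
        return (v >= SpikeFn.v_th).float()

    @staticmethod
    def backward(ctx, grad_out):
        (v,) = ctx.saved_tensors
        sig = torch.sigmoid(SpikeFn.alpha * (v - SpikeFn.v_th))
        return grad_out * SpikeFn.alpha * sig * (1.0 - sig)

class LIFLayer(nn.Module):
    """One fully connected layer of leaky integrate-and-fire neurons."""
    def __init__(self, in_f, out_f, leak=0.9):
        super().__init__()
        self.fc, self.leak = nn.Linear(in_f, out_f), leak

    def forward(self, spikes_t, v):
        v = self.leak * v + self.fc(spikes_t)   # leaky integration of input current
        s = SpikeFn.apply(v)                    # threshold crossing emits a spike
        return s, v - s * SpikeFn.v_th          # soft reset of neurons that fired

# Tiny autoencoder step: loss is taken on the decoder's membrane potential,
# accumulated over T time steps, against the target image intensities.
enc, dec = LIFLayer(784, 128), LIFLayer(128, 784)
target = torch.rand(784)                                 # stand-in target image
spikes = (torch.rand(100, 784) < target).float()         # rate-coded input
v_e, v_d, v_sum = torch.zeros(128), torch.zeros(784), torch.zeros(784)
for t in range(spikes.shape[0]):
    s, v_e = enc(spikes[t], v_e)
    _, v_d = dec(s, v_d)
    v_sum = v_sum + v_d                                  # accumulate output potential
loss = torch.mean((v_sum / spikes.shape[0] - target) ** 2)
loss.backward()                                          # surrogate gradients flow here
```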
topic autoencoders
spiking neural networks
multimodal
audio to image conversion
backpropagation
url https://www.frontiersin.org/article/10.3389/fnins.2019.00621/full