Synthesizing Images From Spatio-Temporal Representations Using Spike-Based Backpropagation
Spiking neural networks (SNNs) offer a promising alternative to current artificial neural networks (ANNs) for enabling low-power, event-driven neuromorphic hardware. Spike-based neuromorphic applications require processing and extracting meaningful information from spatio-temporal data, represented as a series of spike trains over time. In this paper, we propose a method to synthesize images from multiple modalities in a spike-based environment. We use spiking autoencoders to convert image and audio inputs into compact spatio-temporal representations that are then decoded for image synthesis. For this, we use a direct training algorithm that computes the loss on the membrane potential of the output layer and back-propagates it using a sigmoid approximation of the neuron's activation function to enable differentiability. The spiking autoencoders are benchmarked on MNIST and Fashion-MNIST and achieve very low reconstruction loss, comparable to ANNs. The spiking autoencoders are then trained to learn meaningful spatio-temporal representations of the data across the two modalities, audio and visual. We synthesize images from audio in a spike-based environment by first generating, and then utilizing, such shared multi-modal spatio-temporal representations. Our audio-to-image synthesis model is tested on the task of converting TI-46 digit audio samples to MNIST images. We are able to synthesize images with high fidelity, and the model achieves performance competitive with ANNs.
| Main Authors: | Deboleena Roy, Priyadarshini Panda, Kaushik Roy |
|---|---|
| Format: | Article |
| Language: | English |
| Published: | Frontiers Media S.A., 2019-06-01 |
| Series: | Frontiers in Neuroscience |
| ISSN: | 1662-453X |
| Subjects: | autoencoders; spiking neural networks; multimodal; audio to image conversion; backpropagation |
| Online Access: | https://www.frontiersin.org/article/10.3389/fnins.2019.00621/full |
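The audio-to-image path can then be sketched as a cross-modal readout. The function below is hypothetical; it reuses `spike_fn` from the sketch above, and `audio_enc` and `image_dec` are assumed to be `nn.Linear` layers like those in `SpikingAutoencoder`, with the audio encoder trained to emit the shared multi-modal spatio-temporal representation that the pretrained image decoder consumes.

```python
def synthesize_image_from_audio(audio_spikes, audio_enc, image_dec,
                                leak=0.99, threshold=1.0):
    """Hypothetical cross-modal readout: audio spikes -> shared latent spike
    train -> accumulated image-layer membrane potential, read out as pixels."""
    T, batch, _ = audio_spikes.shape
    v_h = audio_spikes.new_zeros(batch, audio_enc.out_features)
    v_img = audio_spikes.new_zeros(batch, image_dec.out_features)
    for t in range(T):
        v_h = leak * v_h + audio_enc(audio_spikes[t])
        s_h = spike_fn(v_h, threshold)          # shared spatio-temporal code
        v_h = v_h - s_h * threshold             # soft reset on spiking
        v_img = leak * v_img + image_dec(s_h)   # drive the image decoder
    return v_img                                # e.g., clamp to [0, 1] for display
```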