Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks

Current trends towards on-edge computing on smart portable devices require ultra-low power circuits able to perform feature extraction and pattern classification. This manuscript proposes a novel approach for feature extraction operations in speech recognition/voice activity detection...

Bibliographic Details
Main Authors: Eric Gutierrez, Carlos Perez, Fernando Hernandez, Luis Hernandez
Format: Article
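Language: English
Published: MDPI AG 2020-02-01
Series: Electronics
Subjects: artificial intelligence; machine learning; speech recognition; features extraction; voltage-controlled-oscillator; analog-to-digital converter
Online Access: https://www.mdpi.com/2079-9292/9/3/418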
id doaj-25d5a66fdf1747f1a96766bcd4cb9520
record_format Article
spelling doaj-25d5a66fdf1747f1a96766bcd4cb9520
2020-11-25T03:19:30Z
eng
MDPI AG
Electronics
2079-9292
2020-02-01
volume 9, issue 3, article 418
DOI: 10.3390/electronics9030418
electronics9030418
Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks
Eric Gutierrez (0)
Carlos Perez (1)
Fernando Hernandez (2)
Luis Hernandez (3)
(0, 1, 2, 3) Department of Electronics Technology, Carlos III University of Madrid, 28911 Leganes, Spain
https://www.mdpi.com/2079-9292/9/3/418
collection DOAJ
language English
format Article
sources DOAJ
author Eric Gutierrez
Carlos Perez
Fernando Hernandez
Luis Hernandez
title Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks
publisher MDPI AG
series Electronics
issn 2079-9292
publishDate 2020-02-01
description Current trends towards on-edge computing on smart portable devices require ultra-low power circuits able to perform feature extraction and pattern classification. This manuscript proposes a novel approach for feature extraction operations in speech recognition/voice activity detection tasks suitable for portable devices. Whereas conventional approaches are based on either completely analog or completely digital structures, we propose a “hybrid” approach by means of voltage-controlled oscillators. Our proposal makes use of a bank of band-pass filters implemented with ring oscillators to extract the features (energy within different frequency bands) of input audio signals and digitize them. Afterwards, these data are fed to a digital classification stage such as a neural network. Ring oscillators are structures with a digital nature, which makes them highly scalable and allows them to be designed with minimum-length devices. Additionally, due to their inherent phase integration, low-frequency band-pass filters can be implemented without large capacitors. Consequently, the approach yields substantial savings in power consumption and area. Finally, our proposal may incorporate the analog-to-digital converter into the feature extractor circuit itself to perform the full conversion of the raw data when triggered. This is a distinct advantage over other approaches. The architecture is described and proposed at system level, along with behavioral simulations made to check that the performance is as expected. Then the structure is designed in a 65-nm CMOS process to estimate the power consumption and area of a silicon implementation. The results show that our solution is very promising in terms of occupied area, with a competitive power consumption in comparison to other state-of-the-art solutions.
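The feature named in the description (energy within different frequency bands) can be illustrated with a minimal software sketch. This is only a digital analogy under stated assumptions, not the authors' ring-oscillator circuit: the function names, the 16 kHz sample rate, the band centres and the quality factor are all hypothetical, and the filters are ordinary second-order band-pass sections rather than oscillator-based ones.

```python
import math

def bandpass_coeffs(f0, fs, q):
    """Second-order band-pass section with unity peak gain at f0 (assumed design)."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)                    # feed-forward taps
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)    # feedback taps
    return b, a

def band_energy(x, f0, fs, q=4.0):
    """Mean-square value of x after band-pass filtering around f0."""
    (b0, b1, b2), (a1, a2) = bandpass_coeffs(f0, fs, q)
    x1 = x2 = y1 = y2 = 0.0
    acc = 0.0
    for xn in x:
        yn = b0 * xn + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        acc += yn * yn
    return acc / len(x)

def extract_features(x, fs, centers):
    """One energy value per band: the feature vector passed to a classifier."""
    return [band_energy(x, f0, fs) for f0 in centers]

# Hypothetical settings, chosen only for illustration.
fs = 16000
centers = [250, 500, 1000, 2000, 4000]
tone = [math.sin(2.0 * math.pi * 1000 * n / fs) for n in range(fs // 4)]
features = extract_features(tone, fs, centers)
print(features)
```

For a 1 kHz test tone the largest value falls in the band centred at 1 kHz. A vector of this kind is what the digital classification stage mentioned in the description would consume.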
topic artificial intelligence
machine learning
speech recognition
features extraction
voltage-controlled-oscillator
analog-to-digital converter
url https://www.mdpi.com/2079-9292/9/3/418