Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks
Current trends towards on-edge computing on smart portable devices require ultra-low-power circuits that can perform feature extraction and pattern classification tasks. This manuscript proposes a novel approach for feature extraction operations in speech recognition/voice activity detection...
Main Authors: | Eric Gutierrez, Carlos Perez, Fernando Hernandez, Luis Hernandez |
---|---|
Format: | Article |
Language: | English |
Published: | MDPI AG, 2020-02-01 |
Series: | Electronics |
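Subjects: | artificial intelligence; machine learning; speech recognition; features extraction; voltage-controlled-oscillator; analog-to-digital converter |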
Online Access: | https://www.mdpi.com/2079-9292/9/3/418 |
id |
doaj-25d5a66fdf1747f1a96766bcd4cb9520 |
---|---|
record_format |
Article |
spelling |
doaj-25d5a66fdf1747f1a96766bcd4cb9520; 2020-11-25T03:19:30Z; eng; MDPI AG; Electronics; ISSN 2079-9292; 2020-02-01; vol. 9, no. 3, art. 418; DOI 10.3390/electronics9030418; Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks; Eric Gutierrez, Carlos Perez, Fernando Hernandez, Luis Hernandez (all: Department of Electronics Technology, Carlos III University of Madrid, 28911 Leganes, Spain); https://www.mdpi.com/2079-9292/9/3/418; artificial intelligence; machine learning; speech recognition; features extraction; voltage-controlled-oscillator; analog-to-digital converter |
collection |
DOAJ |
language |
English |
format |
Article |
sources |
DOAJ |
author |
Eric Gutierrez, Carlos Perez, Fernando Hernandez, Luis Hernandez |
author_sort |
Eric Gutierrez |
title |
Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks |
publisher |
MDPI AG |
series |
Electronics |
issn |
2079-9292 |
publishDate |
2020-02-01 |
description |
Current trends towards on-edge computing on smart portable devices require ultra-low-power circuits that can perform feature extraction and pattern classification tasks. This manuscript proposes a novel approach for feature extraction operations in speech recognition/voice activity detection tasks suitable for portable devices. Whereas conventional approaches are based on either completely analog or completely digital structures, we propose a “hybrid” approach by means of voltage-controlled oscillators. Our proposal makes use of a bank of band-pass filters implemented with ring oscillators to extract the features (energy within different frequency bands) of input audio signals and digitize them. Afterwards, these data are fed to a digital classification stage such as a neural network. Ring oscillators are structures with a digital nature, which makes them highly scalable, with the possibility of designing them with minimum-length devices. Additionally, due to their inherent phase integration, low-frequency band-pass filters can be implemented without large capacitors. Consequently, we benefit strongly from power consumption and area savings. Finally, our proposal may incorporate the analog-to-digital converter into the structure of the feature extractor circuit itself to perform the full conversion of the raw data when triggered. This is a unique advantage with respect to other approaches. The architecture is described and proposed at system level, along with behavioral simulations made to check whether the performance is as expected. Then the structure is designed in a 65-nm CMOS process to estimate the power consumption and area of a silicon implementation. The results show that our solution is very promising in terms of occupied area, with a competitive power consumption in comparison to other state-of-the-art solutions. |
topic |
artificial intelligence; machine learning; speech recognition; features extraction; voltage-controlled-oscillator; analog-to-digital converter |
url |
https://www.mdpi.com/2079-9292/9/3/418 |
_version_ |
1724622010610352128 |
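
The description above characterizes the extracted features as the energy within different frequency bands, obtained from a bank of band-pass filters and then digitized. As a rough software illustration of that idea only, the following Python sketch computes quantized band energies with conventional digital biquad band-pass filters. It is not the authors' ring-oscillator circuit: the filter type, the center frequencies, the quality factor `q`, the 8-bit quantization, the normalization to the strongest band, and all function names are assumptions made for this example.

```python
import math

def bandpass_coeffs(f0, q, fs):
    """Biquad band-pass coefficients (unity gain at f0), normalized by a0."""
    w0 = 2.0 * math.pi * f0 / fs
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b = (alpha / a0, 0.0, -alpha / a0)
    a = (-2.0 * math.cos(w0) / a0, (1.0 - alpha) / a0)
    return b, a

def band_energy(samples, f0, q, fs):
    """Mean squared output of one band-pass filter (direct form I)."""
    (b0, b1, b2), (a1, a2) = bandpass_coeffs(f0, q, fs)
    x1 = x2 = y1 = y2 = 0.0
    energy = 0.0
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        energy += y * y
    return energy / len(samples)

def extract_features(samples, centers, fs, q=4.0, bits=8):
    """One quantized energy value per band, scaled to the strongest band.

    The scaling and bit width are illustrative assumptions, not taken
    from the paper.
    """
    energies = [band_energy(samples, f0, q, fs) for f0 in centers]
    peak = max(energies) or 1.0  # avoid dividing by zero on silent input
    levels = (1 << bits) - 1
    return [round(levels * e / peak) for e in energies]

# Hypothetical setup: 16 kHz sampling, five octave-spaced bands.
FS = 16000
CENTERS = [250, 500, 1000, 2000, 4000]

# A 1 kHz test tone, a quarter of a second long.
tone = [math.sin(2.0 * math.pi * 1000 * n / FS) for n in range(FS // 4)]
features = extract_features(tone, CENTERS, FS)
```

For the 1 kHz test tone, the largest entry of `features` falls in the band centered at 1000 Hz. A vector of this kind is what a downstream digital classifier, such as the neural network mentioned in the description, would take as input.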