Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks

Current trends towards on-edge computing on smart portable devices require ultra-low power circuits capable of performing feature extraction and pattern classification tasks. This manuscript proposes a novel approach to feature extraction in speech recognition/voice activity detection...
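
The full abstract in this record describes the extracted features as the energy within different frequency bands of the input audio signal. As a rough software illustration of that idea only, the sketch below computes per-band energies for one audio frame. It assumes a naive DFT in place of the bank of band-pass filters and does not model the ring-oscillator circuit proposed in the article; the function name, sample rate, frame length, and band edges are all hypothetical choices made for this example.

```python
import cmath
import math


def band_energies(frame, sample_rate, bands):
    """Return the energy of `frame` inside each (low_hz, high_hz) band.

    Illustration only: a naive DFT stands in for the bank of band-pass
    filters; this is not a model of the ring-oscillator circuit.
    """
    n = len(frame)
    energies = []
    for low_hz, high_hz in bands:
        # DFT bin k corresponds to the frequency k * sample_rate / n.
        k_low = math.ceil(low_hz * n / sample_rate)
        k_high = math.floor(high_hz * n / sample_rate)
        energy = 0.0
        for k in range(k_low, k_high + 1):
            coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                        for i, x in enumerate(frame))
            energy += abs(coeff) ** 2
        # Normalize so the result does not grow with the frame length.
        energies.append(energy / n ** 2)
    return energies


# Hypothetical parameters: 8 kHz sampling, one 256-sample frame, four bands.
SAMPLE_RATE = 8000
BANDS = [(100, 400), (400, 800), (800, 1600), (1600, 3200)]

# Test input: a 500 Hz tone, which should fall in the second band.
frame = [math.sin(2 * math.pi * 500 * i / SAMPLE_RATE) for i in range(256)]
features = band_energies(frame, SAMPLE_RATE, BANDS)
```

With these assumed parameters, almost all of the energy of the 500 Hz test tone lands in the 400 to 800 Hz band, so `features` is the kind of short vector that a downstream digital classifier, such as the neural network mentioned in the abstract, could take as input.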

Bibliographic Details
Main Authors:	Eric Gutierrez, Carlos Perez, Fernando Hernandez, Luis Hernandez
Format:	Article
Language:	English
Published:	MDPI AG 2020-02-01
Series:	Electronics
Subjects:	artificial intelligence, machine learning, speech recognition, features extraction, voltage-controlled-oscillator, analog-to-digital converter
Online Access:	https://www.mdpi.com/2079-9292/9/3/418

id	doaj-25d5a66fdf1747f1a96766bcd4cb9520
record_format	Article
spelling	doaj-25d5a66fdf1747f1a96766bcd4cb9520; 2020-11-25T03:19:30Z; eng; MDPI AG; Electronics; 2079-9292; 2020-02-01; 9; 3; 418; 10.3390/electronics9030418; electronics9030418; Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks; Eric Gutierrez 0; Carlos Perez 1; Fernando Hernandez 2; Luis Hernandez 3; Department of Electronics Technology, Carlos III University of Madrid, 28911 Leganes, Spain; Department of Electronics Technology, Carlos III University of Madrid, 28911 Leganes, Spain; Department of Electronics Technology, Carlos III University of Madrid, 28911 Leganes, Spain; Department of Electronics Technology, Carlos III University of Madrid, 28911 Leganes, Spain; Current trends towards on-edge computing on smart portable devices require ultra-low power circuits capable of performing feature extraction and pattern classification tasks. This manuscript proposes a novel approach to feature extraction in speech recognition/voice activity detection tasks, suitable for portable devices. Whereas conventional approaches are based on either completely analog or completely digital structures, we propose a “hybrid” approach by means of voltage-controlled oscillators. Our proposal makes use of a bank of band-pass filters implemented with ring oscillators to extract the features (energy within different frequency bands) of input audio signals and digitize them. Afterwards, these data are fed to a digital classification stage such as a neural network. Ring oscillators are structures of a digital nature, which makes them highly scalable and allows them to be designed with minimum-length devices. Additionally, due to their inherent phase integration, low-frequency band-pass filters can be implemented without large capacitors. Consequently, we benefit strongly from power consumption and area savings. Finally, our proposal may incorporate the analog-to-digital converter into the structure of the feature extractor circuit itself to perform the full conversion of the raw data when triggered. This is a unique advantage with respect to other approaches. The architecture is described and proposed at system level, along with behavioral simulations that check whether the performance is as expected. The structure is then designed in a 65-nm CMOS process to estimate the power consumption and area of a silicon implementation. The results show that our solution is very promising in terms of occupied area, with competitive power consumption in comparison to other state-of-the-art solutions.; https://www.mdpi.com/2079-9292/9/3/418; artificial intelligence; machine learning; speech recognition; features extraction; voltage-controlled-oscillator; analog-to-digital converter
collection	DOAJ
language	English
format	Article
sources	DOAJ
author	Eric Gutierrez, Carlos Perez, Fernando Hernandez, Luis Hernandez
author_facet	Eric Gutierrez, Carlos Perez, Fernando Hernandez, Luis Hernandez
author_sort	Eric Gutierrez
title	Time-Encoding-Based Ultra-Low Power Features Extraction Circuit for Speech Recognition Tasks
publisher	MDPI AG
series	Electronics
issn	2079-9292
publishDate	2020-02-01
description	Current trends towards on-edge computing on smart portable devices require ultra-low power circuits capable of performing feature extraction and pattern classification tasks. This manuscript proposes a novel approach to feature extraction in speech recognition/voice activity detection tasks, suitable for portable devices. Whereas conventional approaches are based on either completely analog or completely digital structures, we propose a “hybrid” approach by means of voltage-controlled oscillators. Our proposal makes use of a bank of band-pass filters implemented with ring oscillators to extract the features (energy within different frequency bands) of input audio signals and digitize them. Afterwards, these data are fed to a digital classification stage such as a neural network. Ring oscillators are structures of a digital nature, which makes them highly scalable and allows them to be designed with minimum-length devices. Additionally, due to their inherent phase integration, low-frequency band-pass filters can be implemented without large capacitors. Consequently, we benefit strongly from power consumption and area savings. Finally, our proposal may incorporate the analog-to-digital converter into the structure of the feature extractor circuit itself to perform the full conversion of the raw data when triggered. This is a unique advantage with respect to other approaches. The architecture is described and proposed at system level, along with behavioral simulations that check whether the performance is as expected. The structure is then designed in a 65-nm CMOS process to estimate the power consumption and area of a silicon implementation. The results show that our solution is very promising in terms of occupied area, with competitive power consumption in comparison to other state-of-the-art solutions.
topic	artificial intelligence, machine learning, speech recognition, features extraction, voltage-controlled-oscillator, analog-to-digital converter
url	https://www.mdpi.com/2079-9292/9/3/418

Similar Items