Summary: | Real-time automatic identification of audio distress signals in urban areas is a task that in a smart city can improve response times in emergency alert systems. The main challenge in this problem lies in finding a model that is able to accurately recognize these type of signals in the presence of background noise and allows for real-time processing. In this paper, we present the design of a portable and low-cost device for accurate audio distress signal recognition in real urban scenarios based on deep learning models. As real audio distress recordings in urban areas have not been collected and made publicly available so far, we first constructed a database where audios were recorded in urban areas using a low-cost microphone. Using this database, we trained a deep multi-headed 2D convolutional neural network that processed temporal and frequency features to accurately recognize audio distress signals in noisy environments with a significant performance improvement to other methods from the literature. Then, we deployed and assessed the trained convolutional neural network model on a Raspberry Pi that, along with the low-cost microphone, constituted a device for accurate real-time audio recognition. Source code and database are publicly available.
|