Multi-Modal Cross Learning for an FMCW Radar Assisted by Thermal and RGB Cameras to Monitor Gestures and Cooking Processes

Bibliographic Details
Main Authors: Marco Altmann, Peter Ott, Nicolaj C. Stache, Christian Waldschmidt
Format: Article
Language: English
Published: IEEE, 2021-01-01
Series: IEEE Access
Subjects: Machine learning; neural networks; radar applications; multimodal sensors; cross learning; autoencoder
Online Access: https://ieeexplore.ieee.org/document/9345685/
id doaj-5ee8db7184d44cbd97af24b39d9038f2
record_format Article
spelling doaj-5ee8db7184d44cbd97af24b39d9038f2
date_stamp 2021-03-30T15:06:45Z
volume 9
pages 22295-22303
doi 10.1109/ACCESS.2021.3056878
ieee_document 9345685
author_details Marco Altmann, https://orcid.org/0000-0001-7118-209X, Institute of Automotive Engineering and Mechatronics, Heilbronn University of Applied Sciences, Heilbronn, Germany
author_details Peter Ott, https://orcid.org/0000-0003-3513-4167, Institute of Automotive Engineering and Mechatronics, Heilbronn University of Applied Sciences, Heilbronn, Germany
author_details Nicolaj C. Stache, https://orcid.org/0000-0002-6308-0146, Institute of Automotive Engineering and Mechatronics, Heilbronn University of Applied Sciences, Heilbronn, Germany
author_details Christian Waldschmidt, https://orcid.org/0000-0003-2090-6136, Institute of Microwave Engineering, Ulm University, Ulm, Germany
collection DOAJ
language English
format Article
sources DOAJ
author Marco Altmann
Peter Ott
Nicolaj C. Stache
Christian Waldschmidt
spellingShingle Marco Altmann
Peter Ott
Nicolaj C. Stache
Christian Waldschmidt
Multi-Modal Cross Learning for an FMCW Radar Assisted by Thermal and RGB Cameras to Monitor Gestures and Cooking Processes
IEEE Access
Machine learning
neural networks
radar applications
multimodal sensors
cross learning
autoencoder
author_facet Marco Altmann
Peter Ott
Nicolaj C. Stache
Christian Waldschmidt
author_sort Marco Altmann
title Multi-Modal Cross Learning for an FMCW Radar Assisted by Thermal and RGB Cameras to Monitor Gestures and Cooking Processes
title_short Multi-Modal Cross Learning for an FMCW Radar Assisted by Thermal and RGB Cameras to Monitor Gestures and Cooking Processes
title_full Multi-Modal Cross Learning for an FMCW Radar Assisted by Thermal and RGB Cameras to Monitor Gestures and Cooking Processes
title_fullStr Multi-Modal Cross Learning for an FMCW Radar Assisted by Thermal and RGB Cameras to Monitor Gestures and Cooking Processes
title_full_unstemmed Multi-Modal Cross Learning for an FMCW Radar Assisted by Thermal and RGB Cameras to Monitor Gestures and Cooking Processes
title_sort multi-modal cross learning for an fmcw radar assisted by thermal and rgb cameras to monitor gestures and cooking processes
publisher IEEE
series IEEE Access
issn 2169-3536
publishDate 2021-01-01
description This paper proposes a multi-modal cross learning approach that augments the neural network training phase with additional sensor data. The approach is multi-modal during training (i.e., radar Range-Doppler maps, thermal camera images, and RGB camera images are used for training). During inference, the approach is single-modal (i.e., only radar Range-Doppler maps are needed for classification). The proposed approach uses a multi-modal autoencoder training step which creates a compressed data representation containing features correlated across modalities. The encoder part is then used as a pretrained network for the classification task. The benefit is that expensive sensors, such as high-resolution thermal cameras, are not needed in the application, while a higher classification accuracy is achieved thanks to the multi-modal cross learning during training. The autoencoders can also be used to generate hallucinated data for the absent sensors; the hallucinated data can be used for user interfaces, further classification, or other tasks. The proposed approach is verified on a simultaneous cooking process classification, 2 × 2 cooktop occupancy detection, and gesture recognition task. The main functionality is overboil protection and gesture control of a 2 × 2 cooktop. The multi-modal cross learning approach considerably outperforms single-modal approaches on this challenging classification task.
topic Machine learning
neural networks
radar applications
multimodal sensors
cross learning
autoencoder
url https://ieeexplore.ieee.org/document/9345685/
work_keys_str_mv AT marcoaltmann multimodalcrosslearningforanfmcwradarassistedbythermalandrgbcamerastomonitorgesturesandcookingprocesses
AT peterott multimodalcrosslearningforanfmcwradarassistedbythermalandrgbcamerastomonitorgesturesandcookingprocesses
AT nicolajcstache multimodalcrosslearningforanfmcwradarassistedbythermalandrgbcamerastomonitorgesturesandcookingprocesses
AT christianwaldschmidt multimodalcrosslearningforanfmcwradarassistedbythermalandrgbcamerastomonitorgesturesandcookingprocesses
_version_ 1724179983806496768
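
The description above outlines a concrete training scheme: a shared multi-modal autoencoder is pretrained to reconstruct radar, thermal, and RGB data from one fused latent code, after which the radar encoder alone is reused as a pretrained network for classification. The following Python/PyTorch sketch illustrates that idea under assumed input shapes, layer sizes, a mean-fusion rule, and class count; it is a minimal illustration of the technique, not the authors' implementation.

import torch
import torch.nn as nn
import torch.nn.functional as F

LATENT = 64  # assumed size of the shared compressed representation

def mlp(sizes):
    """Plain MLP with ReLU between hidden layers (illustrative stand-in
    for the paper's actual encoder/decoder networks)."""
    layers = []
    for i in range(len(sizes) - 1):
        layers.append(nn.Linear(sizes[i], sizes[i + 1]))
        if i < len(sizes) - 2:
            layers.append(nn.ReLU())
    return nn.Sequential(*layers)

class CrossModalAE(nn.Module):
    """Autoencoder with one encoder/decoder pair per modality and a shared
    latent code. Reconstructing every modality from the fused code forces
    the code to carry features correlated across modalities."""

    def __init__(self, dims):
        super().__init__()
        self.enc = nn.ModuleDict({m: mlp([d, 256, LATENT]) for m, d in dims.items()})
        self.dec = nn.ModuleDict({m: mlp([LATENT, 256, d]) for m, d in dims.items()})

    def forward(self, inputs):
        # Fuse the per-modality codes by averaging (an assumed fusion rule).
        z = torch.stack([self.enc[m](x) for m, x in inputs.items()]).mean(dim=0)
        return z, {m: self.dec[m](z) for m in self.dec}

# Flattened input sizes per modality -- placeholder numbers, not the paper's.
dims = {"radar": 1024, "thermal": 4096, "rgb": 12288}
model = CrossModalAE(dims)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Multi-modal pretraining step: reconstruct all three modalities.
batch = {m: torch.randn(8, d) for m, d in dims.items()}  # stand-in data
z, recon = model(batch)
loss = sum(F.mse_loss(recon[m], batch[m]) for m in batch)
opt.zero_grad(); loss.backward(); opt.step()

# Single-modal inference: the pretrained radar encoder feeds a classifier head.
num_classes = 10  # assumed class count
classifier = nn.Sequential(model.enc["radar"], nn.Linear(LATENT, num_classes))
logits = classifier(batch["radar"])

# Hallucinate the absent camera data from radar alone, e.g. for a UI.
z_radar = model.enc["radar"](batch["radar"])
thermal_hallucinated = model.dec["thermal"](z_radar)

Only model.enc["radar"] plus the classifier head are needed at inference; the decoders are retained only if hallucinated thermal or RGB views of the absent sensors are wanted, e.g., for a user interface or a further classification stage.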