Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case
Human-machine interaction can make many daily activities more convenient. The development of smart devices has driven the growth of the underlying smart systems that provide smart, personalized control of devices. The first step in controlling any device is observation; through understand...
Main Author: | Felemban, Abdulwahab A. |
---|---|
Other Authors: | Al-Naffouri, Tareq Y. |
Language: | en |
Published: | 2021 |
Subjects: | Smart-Tap Wudu 3D Skeleton Human Activity Recognition Action Anticipation Real-time |
Online Access: | Felemban, A. A. (2021). Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case. KAUST Research Repository. https://doi.org/10.25781/KAUST-0G21G http://hdl.handle.net/10754/673882 |
id |
ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-673882 |
---|---|
record_format |
oai_dc |
spelling |
ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-673882 2021-12-04T05:07:46Z Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case Felemban, Abdulwahab A. Al-Naffouri, Tareq Y. Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division Ghanem, Bernard Elhoseiny, Mohamed H. Bader, Ahmed Masood, Mudassir Smart-Tap Wudu 3D Skeleton Human Activity Recognition Action Anticipation Real-time Human-machine interaction can make many daily activities more convenient. The development of smart devices has driven the growth of the underlying smart systems that provide smart, personalized control of devices. The first step in controlling any device is observation; by understanding the surrounding environment and human activity, a smart system can physically control a device. Human activity recognition (HAR) is essential in many smart applications such as self-driving cars, human-robot interaction, and automatic systems such as infrared (IR) taps. Human-centric systems must meet certain requirements to perform a physical task in real time, and for human-machine interaction the anticipation of human actions is essential. IR taps suffer from delay because the proximity sensor signals the solenoid valve only when the user’s hands are exactly below the tap. This hardware and electronics delay causes inconvenience and wastes water. In this thesis, an alternative control based on deep learning action anticipation is proposed. Humans interact with taps for various tasks, such as washing their hands or face and brushing their teeth. We focus on a small subset of these activities: those carried out sequentially during an Islamic cleansing ritual called Wudu. The skeleton modality is widely used in HAR because it carries abstract information that is scale-invariant and robust to image variations. We used depth cameras to obtain accurate 3D human skeletons of users performing Wudu. The sequences were manually annotated with ten atomic action classes. This thesis investigated different deep learning networks with architectures optimized for real-time action anticipation. The proposed methods were mainly based on the Spatial-Temporal Graph Convolutional Network (ST-GCN). As a further improvement, we proposed a Gated Recurrent Unit (GRU) model with an ST-GCN backbone that extracts local temporal features. The GRU processes the local temporal latent features sequentially to predict future actions. The proposed models scored 94.14% recall on binary classification (turning the water tap on and off) and recall in the 81.58-89.08% range on multiclass classification. 2021-12-02T09:07:07Z 2021-12-02T09:07:07Z 2021-12 Thesis Felemban, A. A. (2021). Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case. KAUST Research Repository. https://doi.org/10.25781/KAUST-0G21G 10.25781/KAUST-0G21G http://hdl.handle.net/10754/673882 en 2022-11-30 At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis will become available to the public after the expiration of the embargo on 2022-11-30. |
collection |
NDLTD |
language |
en |
sources |
NDLTD |
topic |
Smart-Tap Wudu 3D Skeleton Human Activity Recognition Action Anticipation Real-time |
description |
Human-machine interaction can make many daily activities more convenient. The development of smart devices has driven the growth of the underlying smart systems that provide smart, personalized control of devices. The first step in controlling any device is observation; by understanding the surrounding environment and human activity, a smart system can physically control a device. Human activity recognition (HAR) is essential in many smart applications such as self-driving cars, human-robot interaction, and automatic systems such as infrared (IR) taps. Human-centric systems must meet certain requirements to perform a physical task in real time, and for human-machine interaction the anticipation of human actions is essential. IR taps suffer from delay because the proximity sensor signals the solenoid valve only when the user’s hands are exactly below the tap. This hardware and electronics delay causes inconvenience and wastes water. In this thesis, an alternative control based on deep learning action anticipation is proposed. Humans interact with taps for various tasks, such as washing their hands or face and brushing their teeth. We focus on a small subset of these activities: those carried out sequentially during an Islamic cleansing ritual called Wudu. The skeleton modality is widely used in HAR because it carries abstract information that is scale-invariant and robust to image variations. We used depth cameras to obtain accurate 3D human skeletons of users performing Wudu. The sequences were manually annotated with ten atomic action classes. This thesis investigated different deep learning networks with architectures optimized for real-time action anticipation. The proposed methods were mainly based on the Spatial-Temporal Graph Convolutional Network (ST-GCN). As a further improvement, we proposed a Gated Recurrent Unit (GRU) model with an ST-GCN backbone that extracts local temporal features. The GRU processes the local temporal latent features sequentially to predict future actions. The proposed models scored 94.14% recall on binary classification (turning the water tap on and off) and recall in the 81.58-89.08% range on multiclass classification. |
author2 |
Al-Naffouri, Tareq Y. |
author_facet |
Al-Naffouri, Tareq Y. Felemban, Abdulwahab A. |
author |
Felemban, Abdulwahab A. |
author_sort |
Felemban, Abdulwahab A. |
title |
Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case |
publishDate |
2021 |
url |
Felemban, A. A. (2021). Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case. KAUST Research Repository. https://doi.org/10.25781/KAUST-0G21G http://hdl.handle.net/10754/673882 |
_version_ |
1723963725796343808 |
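
The description reports recall for both a binary task (tap on or off) and a multiclass task (atomic action classes). As a rough illustration of how such figures can be computed, the sketch below derives per-class recall and its unweighted mean from frame-level labels, then collapses action labels to a tap state. The label names, the `WATER_ACTIONS` set, and the choice of an unweighted (macro) average are assumptions made for this example; the record does not say how the thesis maps actions to the valve state or which averaging it uses.

```python
from collections import Counter

# Hypothetical set of actions during which water should flow (an assumption).
WATER_ACTIONS = {"wash_hands", "wash_face"}


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, where recall = TP / (TP + FN) per true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    support = Counter(y_true)  # TP + FN for each label
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # TP
    return {label: hits[label] / support[label] for label in support}


def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


def tap_state(label, water_actions=WATER_ACTIONS):
    """Collapse an action label to a binary tap state."""
    return "on" if label in water_actions else "off"


if __name__ == "__main__":
    # Toy frame-level labels, invented for illustration only.
    y_true = ["wash_hands", "wash_hands", "wash_face", "wash_face",
              "idle", "idle", "idle", "idle"]
    y_pred = ["wash_hands", "wash_face", "wash_face", "wash_face",
              "idle", "idle", "wash_hands", "idle"]
    print("multiclass:", per_class_recall(y_true, y_pred))
    b_true = [tap_state(t) for t in y_true]
    b_pred = [tap_state(p) for p in y_pred]
    print("binary:", per_class_recall(b_true, b_pred))
```

In this toy data a confusion between two water-using actions lowers multiclass recall but leaves the binary tap decision correct, which is one plausible reason a binary score can exceed the multiclass one.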