Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case

Human-machine interaction can make many daily activities more convenient. The development of smart devices has spurred the underlying smart systems that provide smart, personalized control of devices. The first step in controlling any device is observation; through understand...

Bibliographic Details
Main Author: Felemban, Abdulwahab A.
Other Authors: Al-Naffouri, Tareq Y.
Language: en
Published: 2021
Subjects: Smart-Tap; Wudu; 3D Skeleton; Human Activity Recognition; Action Anticipation; Real-time
Online Access: Felemban, A. A. (2021). Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case. KAUST Research Repository. https://doi.org/10.25781/KAUST-0G21G
http://hdl.handle.net/10754/673882
id ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-673882
record_format oai_dc
spelling ndltd-kaust.edu.sa-oai-repository.kaust.edu.sa-10754-673882
2021-12-04T05:07:46Z
Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case
Felemban, Abdulwahab A.
Al-Naffouri, Tareq Y.
Computer, Electrical and Mathematical Science and Engineering (CEMSE) Division
Ghanem, Bernard
Elhoseiny, Mohamed H.
Bader, Ahmed
Masood, Mudassir
Smart-Tap
Wudu
3D Skeleton
Human Activity Recognition
Action Anticipation
Real-time
2021-12-02T09:07:07Z
2021-12-02T09:07:07Z
2021-12
Thesis
Felemban, A. A. (2021). Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case. KAUST Research Repository. https://doi.org/10.25781/KAUST-0G21G
10.25781/KAUST-0G21G
http://hdl.handle.net/10754/673882
en
2022-11-30
At the time of archiving, the student author of this thesis opted to temporarily restrict access to it. The full text of this thesis will become available to the public after the expiration of the embargo on 2022-11-30.
collection NDLTD
language en
sources NDLTD
topic Smart-Tap
Wudu
3D Skeleton
Human Activity Recognition
Action Anticipation
Real-time
description Human-machine interaction can make many daily activities more convenient. The development of smart devices has spurred the underlying smart systems that provide smart, personalized control of devices. The first step in controlling any device is observation; by understanding the surrounding environment and human activity, a smart system can physically control a device. Human activity recognition (HAR) is essential in many smart applications such as self-driving cars, human-robot interaction, and automatic systems such as infrared (IR) taps. Human-centric systems must meet certain requirements to perform a physical task in real time, and for human-machine interaction the anticipation of human actions is essential. IR taps suffer from delay because the proximity sensor signals the solenoid valve only when the user’s hands are exactly below the tap. The hardware and electronics delay causes inconvenience in use and wastes water. In this thesis, an alternative control based on deep learning action anticipation is proposed. Humans interact with taps for various tasks such as washing hands, washing the face, and brushing teeth, to name a few. We focus on a small subset of these activities: those carried out sequentially during an Islamic cleansing ritual called Wudu. The skeleton modality is widely used in HAR because it carries abstract information that is scale-invariant and robust to variations in the imagery. We used depth cameras to obtain accurate 3D human skeletons of users performing Wudu. The sequences were manually annotated with ten atomic action classes. This thesis investigated different deep learning networks with architectures optimized for real-time action anticipation. The proposed methods were mainly based on the Spatial-Temporal Graph Convolutional Network (ST-GCN).
With further improvements, we proposed a Gated Recurrent Unit (GRU) model with an ST-GCN backbone that extracts local temporal features. The GRU processes the local temporal latent features sequentially to predict future actions. The proposed models scored 94.14% recall on binary classification (turning the water tap on and off) and between 81.58% and 89.08% recall on multiclass classification.
author2 Al-Naffouri, Tareq Y.
author_facet Al-Naffouri, Tareq Y.
Felemban, Abdulwahab A.
author Felemban, Abdulwahab A.
author_sort Felemban, Abdulwahab A.
title Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case
publishDate 2021
url Felemban, A. A. (2021). Deep Learning Action Anticipation for Real-time Control of Water Valves: Wudu use case. KAUST Research Repository. https://doi.org/10.25781/KAUST-0G21G
http://hdl.handle.net/10754/673882
work_keys_str_mv AT felembanabdulwahaba deeplearningactionanticipationforrealtimecontrolofwatervalveswuduusecase
_version_ 1723963725796343808
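
Note on the recall figures in the description: recall for a class is the fraction of samples truly belonging to that class that the model labels correctly. The following is a minimal sketch of that computation, not the thesis's evaluation code. The function names, the toy labels, and the use of macro averaging are assumptions made for illustration; the record does not state how the multiclass recall was averaged.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: recall}, where recall = correct predictions / true samples of that label."""
    support = Counter(y_true)  # number of true samples per class
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)  # correct predictions per class
    return {label: hits[label] / support[label] for label in support}


def macro_recall(y_true, y_pred):
    """Unweighted mean of the per-class recalls (an assumed averaging scheme)."""
    recalls = per_class_recall(y_true, y_pred)
    return sum(recalls.values()) / len(recalls)


# Hypothetical toy labels for the binary tap-control case ("on" / "off").
y_true = ["on", "on", "on", "off", "off", "on", "off", "on"]
y_pred = ["on", "on", "off", "off", "off", "on", "on", "on"]

if __name__ == "__main__":
    print(per_class_recall(y_true, y_pred))
    print(macro_recall(y_true, y_pred))
```

The same two functions apply unchanged to a ten-class label set such as the atomic Wudu actions, since they only count matches per label.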