Summary: | <p>Recognising and estimating gestures is a fundamental step towards translating a sign language into a spoken language. It is a challenging problem and, at the same time, a growing area of research in computer vision. This thesis presents two approaches, an example-based and a learning-based approach, for performing integrated detection, segmentation and 3D estimation of the human upper body from a single camera view. It investigates whether an upper body pose can be estimated from a database of exemplars with labelled poses. It also investigates whether an upper body pose can be estimated using skin feature extraction, Support Vector Machines (SVMs) and a 3D human body model. The example-based and learning-based approaches achieved success rates of 64% and 88%, respectively. An analysis of the two approaches has shown that, although the learning-based system generally performs better than the example-based system, both approaches are suitable for recognising and estimating upper body poses in a South African Sign Language recognition and translation system.</p>
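<p>For illustration only, a minimal sketch of the learning-based idea summarised above: an SVM trained on skin-based feature vectors (e.g. face and hand positions) to classify upper body poses. The feature dimensionality, class count and data below are assumptions made for the example, not the thesis implementation.</p>
<pre><code># Minimal sketch, not the thesis implementation: an SVM pose classifier
# over hypothetical skin-based feature vectors.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical training data: each row is a feature vector derived from
# skin regions (face and hand positions); each label is a pose class.
X_train = rng.random((200, 6))     # assumed 6-D skin-feature vectors
y_train = rng.integers(0, 5, 200)  # assumed 5 pose classes

clf = SVC(kernel="rbf")            # SVM classifier for upper body poses
clf.fit(X_train, y_train)

# Predict the pose class for a new frame's feature vector; in the full
# pipeline the predicted class would drive a 3D human body model.
pose = clf.predict(rng.random((1, 6)))
print("predicted pose class:", pose[0])
</code></pre>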