Design and Implementation of a Deep Learning-based Sign Language Recognition Application

Rapid advances in deep learning have captured the attention of technology companies worldwide, prompting them to explore replacing traditional solutions with deep learning approaches across many fields. One notable area of focus is Human-Computer Interaction, particularly the development of sign language recognition systems. This thesis presents the design and implementation of a sign language recognition system using deep learning techniques, incorporating three model architectures: Convolutional Neural Network (CNN), Gated Recurrent Unit (GRU), and Long Short-Term Memory (LSTM) networks. The objective is to create an isolated sign language recognition application that can serve as a tool for learning and practicing sign language. The system integrates MediaPipe Holistic for hand and body keypoint detection with deep learning models for sequence prediction. A user-friendly interface, developed with PyQt5, enables users to capture new sign language samples, train the models, and evaluate their performance. Key functionalities include real-time video frame capture, gesture recognition, and text-to-speech conversion, which provides auditory feedback for the recognized signs. The thesis evaluates the performance of the CNN, GRU, and LSTM models in recognizing six common German Sign Language phrases. The CNN and GRU models each achieved 100% accuracy, precision, recall, and F1 score, indicating high effectiveness in recognizing the sign language phrases. The LSTM model, while less accurate, still performed reliably with an accuracy of 78% and an F1 score of 0.77. These results demonstrate the potential of the CNN and GRU models for effective isolated sign language interpretation, highlighting their suitability both as accessible communication solutions for the deaf community and as learning aids.
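The abstract does not say how the detected keypoints are passed to the sequence models. As a minimal sketch of one common arrangement, the Python below flattens the pose and hand landmarks of each frame into a fixed-width vector and pads each sample to a fixed number of frames. The landmark counts (33 pose landmarks with x, y, z, visibility; 21 landmarks per hand with x, y, z) follow MediaPipe Holistic. The function names, the zero-filling of undetected hands, and the padding scheme are assumptions made for illustration, not the thesis's documented method.

```python
from typing import List, Optional, Sequence

# Landmark counts as produced by MediaPipe Holistic.
POSE_LANDMARKS = 33   # each pose landmark: x, y, z, visibility
HAND_LANDMARKS = 21   # each hand landmark: x, y, z
POSE_DIMS = 4
HAND_DIMS = 3
FEATURES_PER_FRAME = POSE_LANDMARKS * POSE_DIMS + 2 * HAND_LANDMARKS * HAND_DIMS

Points = Sequence[Sequence[float]]


def flatten(points: Optional[Points], count: int, dims: int) -> List[float]:
    """Flatten one landmark group; an undetected group becomes zeros (assumption)."""
    if points is None:
        return [0.0] * (count * dims)
    if len(points) != count:
        raise ValueError(f"expected {count} landmarks, got {len(points)}")
    flat: List[float] = []
    for point in points:
        if len(point) != dims:
            raise ValueError(f"expected {dims} values per landmark, got {len(point)}")
        flat.extend(float(value) for value in point)
    return flat


def frame_features(pose: Optional[Points],
                   left_hand: Optional[Points],
                   right_hand: Optional[Points]) -> List[float]:
    """Concatenate pose, left-hand, and right-hand keypoints for one video frame."""
    return (flatten(pose, POSE_LANDMARKS, POSE_DIMS)
            + flatten(left_hand, HAND_LANDMARKS, HAND_DIMS)
            + flatten(right_hand, HAND_LANDMARKS, HAND_DIMS))


def pad_sequence(frames: Sequence[List[float]], length: int) -> List[List[float]]:
    """Truncate or zero-pad a sample so every sample has the same number of frames."""
    clipped = [list(frame) for frame in frames[:length]]
    missing = length - len(clipped)
    clipped.extend([0.0] * FEATURES_PER_FRAME for _ in range(missing))
    return clipped
```

Under these assumptions each sample becomes a `length` by `FEATURES_PER_FRAME` array, a shape that recurrent layers (GRU, LSTM) and one-dimensional convolutions can both consume.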

Metadata
Author: Hamed Mishian
Document Type: Master's Thesis
Language: English
Date of Publication (online): 2025/01/28
Year of first Publication: 2025
Page Number: 62
Faculty: Westsächsische Hochschule Zwickau / Physikalische Technik, Informatik
Release Date: 2025/02/20