GestureTalk "Turning gestures into conversations"
GestureTalk is an AI tool that translates sign language into text, speech, and multiple spoken languages in real time, using gesture recognition, facial expression recognition, and adaptive learning for high accuracy.
Created on 5th April 2025
The problem GestureTalk solves
The Problem
Communication is a fundamental human right, yet millions of deaf and hard-of-hearing individuals in India face daily challenges due to the limited awareness and use of sign language among the general population.
This communication gap leads to:
Social isolation
Barriers in education
Limited access to healthcare
Fewer employment opportunities
Our Solution
We introduce an intelligent, real-time sign language translation system powered by computer vision and machine learning. Key features include:
Multilingual support (English, Hindi, Spanish, French, and more)
Facial expression detection to enhance emotional context
Bidirectional communication between signers and non-signers
This system enables seamless, inclusive, and accessible communication, empowering the deaf and hard-of-hearing community with independence and equal opportunity.
Challenges we ran into
Our solution leverages cutting-edge technologies to create an intelligent, real-time Sign Language interpretation system:
Computer Vision-Based Gesture Recognition:
We implemented advanced computer vision techniques to accurately detect and interpret hand gestures, including complex signs such as “J,” “Z,” and numbers 0–9. Facial expression recognition is also integrated to capture emotional context, improving communication clarity.
Machine Learning for High Accuracy
Our gesture recognition model was trained on an extensive dataset using techniques like Random Forest, achieving a high accuracy of 99.58%. The system is equipped with adaptive learning to refine its accuracy over time based on user feedback.
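The training step can be sketched as follows. This is a minimal illustration, not the project's actual code: the feature layout (flattened hand-landmark coordinates), the synthetic data, and the hyperparameters are all assumptions standing in for the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Hypothetical setup: each sample is 21 hand landmarks x (x, y) = 42 features.
# Synthetic random data stands in for the real labeled gesture dataset.
rng = np.random.default_rng(0)
X = rng.random((500, 42))
y = rng.integers(0, 10, size=500)  # e.g. digit gestures 0-9

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Train a Random Forest on the landmark features.
clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)

# Predict the gesture label for one held-out frame.
print(clf.predict(X_test[:1]))
```

With real landmark data, the same `fit`/`predict` pattern applies; accuracy would be measured on a held-out split rather than assumed.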
Real-Time Gesture Translation
Sign gestures are translated into both text and speech in real time using the pyttsx3 text-to-speech engine. The output is multilingual, supporting translation into various spoken languages for broader accessibility.
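A hedged sketch of the output stage: per-frame letter predictions are buffered into a sentence, which pyttsx3 then speaks. The `letters_to_text` helper and the `"SPACE"` marker are illustrative assumptions, not the project's actual interface.

```python
def letters_to_text(letters):
    """Join per-frame letter predictions into a sentence; 'SPACE' marks word breaks."""
    words, current = [], []
    for ch in letters:
        if ch == "SPACE":
            if current:
                words.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        words.append("".join(current))
    return " ".join(words)

sentence = letters_to_text(["H", "I", "SPACE", "M", "O", "M"])
print(sentence)  # "HI MOM"

# Speak the result; guarded because pyttsx3 needs an installed audio backend.
try:
    import pyttsx3
    engine = pyttsx3.init()
    engine.say(sentence)
    engine.runAndWait()
except Exception:
    pass  # no TTS backend available in this environment
```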
User-Friendly Interface
The application features a clean, intuitive UI with visual feedback, ensuring ease of use for both deaf and hearing users.
Enhanced Usability & Smart Assistance
Key features include:
Backspace functionality for correcting incorrect predictions
Intelligent word suggestions for smoother and faster communication
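The word-suggestion feature could be implemented as simple prefix matching over a vocabulary; a minimal sketch, where the vocabulary and alphabetical ranking are placeholders for whatever the real system uses:

```python
def suggest(prefix, vocabulary, limit=3):
    """Return up to `limit` vocabulary words starting with the typed prefix."""
    prefix = prefix.lower()
    matches = [w for w in vocabulary if w.lower().startswith(prefix)]
    return sorted(matches)[:limit]

vocab = ["hello", "help", "helmet", "world", "water"]
print(suggest("hel", vocab))  # ['hello', 'helmet', 'help']
```

A production version would rank by word frequency or user history rather than alphabetically.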
Cross-Platform Compatibility
Designed for smartphones, tablets, and computers, the system supports multiple sign and spoken languages, making it globally adaptable and accessible.
System Flowchart
- Data Collection: Gesture images are captured and labeled with left/right-hand variations.
- Preprocessing: Extracts hand landmarks, normalizes data, and computes additional features like wrist angles.
- Model Training: Random Forest Classifier is trained on preprocessed data.
- Real-Time Prediction: Webcam frames are processed, and the system predicts gestures live.
- Speech & Translation: The detected gesture is converted into text and speech, then translated into the selected spoken language.
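The preprocessing step above can be sketched as follows. The landmark layout (wrist at index 0, middle-finger base at index 9, MediaPipe-style 21 points) and the exact angle feature are assumptions for illustration:

```python
import math

def preprocess(landmarks):
    """Normalize (x, y) hand landmarks relative to the wrist and add a wrist angle.

    `landmarks` is a list of (x, y) tuples with the wrist at index 0,
    as in MediaPipe's 21-point hand model (an assumed layout).
    """
    wx, wy = landmarks[0]
    # Translate so the wrist sits at the origin (position invariance).
    shifted = [(x - wx, y - wy) for x, y in landmarks]
    # Scale by the largest distance from the wrist (size invariance).
    scale = max(math.hypot(x, y) for x, y in shifted) or 1.0
    features = [c / scale for pt in shifted for c in pt]
    # Wrist angle: direction from the wrist to the middle-finger base.
    mx, my = shifted[9]
    features.append(math.atan2(my, mx))
    return features

# Example: 21 dummy landmarks along a line.
pts = [(i * 0.1, i * 0.05) for i in range(21)]
feats = preprocess(pts)
print(len(feats))  # 43: 21 landmarks x 2 coords + 1 angle
```

The resulting feature vector is what the Random Forest classifier would consume at both training and prediction time.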
Technologies used
