
EchoEyes

Where Echoes Guide the Eyes

Created on 27th July 2025


The problem EchoEyes solves

Over 39 million people globally are blind, and over 246 million have low vision. Navigating daily life without clear visual input makes tasks like identifying obstacles, reading signs, or moving through unfamiliar spaces extremely challenging. Traditional aids like white canes detect immediate obstacles but cannot provide contextual awareness or guidance. There is a critical need for an intelligent, affordable solution that enhances independent mobility and situational understanding for the visually impaired.
EchoEyes is a wearable AI-powered smart assistant designed to enhance the mobility and independence of visually impaired individuals. Combining obstacle detection, object recognition, and voice feedback, the system offers real-time environmental awareness through sound.

The device senses nearby obstacles using an ultrasonic sensor and alerts the user through a buzzer or voice prompts. A camera module captures visual information, which is processed using AI models to recognize objects, text, or signs. The results are then converted to speech using a text-to-speech engine, enabling the user to hear what’s around them in real time.
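The core of this pipeline can be sketched in a few small helpers. This is an illustrative sketch, not the project's actual code: the function names and the 100 cm alert threshold are assumptions, and it models an HC-SR04-style ultrasonic sensor, where distance is derived from the round-trip time of the echo pulse.

```python
SPEED_OF_SOUND_CM_PER_S = 34300  # approximate speed of sound in air at ~20 °C


def echo_to_distance_cm(echo_seconds):
    # The ultrasonic pulse travels to the obstacle and back,
    # so halve the round-trip time before converting to distance.
    return SPEED_OF_SOUND_CM_PER_S * echo_seconds / 2


def should_alert(distance_cm, threshold_cm=100):
    # Trigger the buzzer / voice prompt when an obstacle is closer
    # than the threshold (threshold value is an assumption).
    return distance_cm < threshold_cm


def describe(labels):
    # Turn the object-detection labels into a sentence for the
    # text-to-speech engine to read out.
    if not labels:
        return "No objects detected."
    if len(labels) == 1:
        return f"I see {labels[0]} ahead."
    return "I see " + ", ".join(labels[:-1]) + " and " + labels[-1] + " ahead."
```

For example, a 10 ms echo corresponds to roughly 171 cm, which would not trigger an alert at the assumed threshold, while the detector output `["person", "chair"]` would be spoken as "I see person and chair ahead."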

While this prototype currently runs on a laptop, the same AI code can be seamlessly ported to a mobile app or a Raspberry Pi–based embedded system for real-world deployment. In a full-fledged product, the user would simply wear a small smart camera, such as one mounted on glasses connected to a pocket-sized processor or their smartphone, making the device compact, wearable, and suitable for daily use without relying on bulky equipment.

By transforming visual cues into sound, EchoEyes empowers the visually impaired to explore their surroundings with greater confidence, independence, and dignity.

Challenges we ran into

One of the most critical challenges faced during the development of EchoEyes is the computational intensity of the system, especially when running on devices without dedicated GPU support. The use of the YOLOv8s object detection model, while accurate and reliable, demands significant processing power. When executed on CPU-only laptops or embedded devices like Raspberry Pi, the model tends to consume high CPU resources, resulting in:

Noticeable lag between voice commands and system response

Low frame rates in the live video feed

Frequent system slowdowns or unexpected crashes, particularly under sustained usage
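One common mitigation for this kind of CPU pressure, sketched below as an assumption rather than as part of the current prototype, is to run the expensive detector only every Nth frame and reuse the most recent results for the frames in between. The class name and interval are illustrative.

```python
class FrameSkipper:
    """Run an expensive per-frame detector only every `interval` frames,
    returning the cached results for the frames in between."""

    def __init__(self, detect, interval=5):
        self.detect = detect      # e.g. a YOLOv8 inference call
        self.interval = interval  # how many frames share one detection pass
        self.count = 0
        self.last = []

    def process(self, frame):
        if self.count % self.interval == 0:
            self.last = self.detect(frame)  # refresh detections
        self.count += 1
        return self.last  # stale but cheap on skipped frames
```

With `interval=5`, a 30 fps feed only pays for six detection passes per second, at the cost of detections lagging by up to four frames.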

These performance issues are further exacerbated when combined with other real-time tasks such as text-to-speech (TTS) synthesis and speech recognition, both of which also require significant system resources. The simultaneous execution of these tasks leads to system instability, especially on constrained hardware.

The prototype currently employs a workaround that uses Google Text-to-Speech (gTTS) in combination with pygame to generate and play voice feedback. While this approach is simple and works on most desktops, it introduces several limitations that hinder performance and reliability:
Internet Dependency: gTTS relies on an active internet connection to convert text to speech. In real-world applications, especially for assistive devices designed to work in offline or outdoor environments, this dependency is a significant drawback.
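The gTTS-plus-pygame workaround can be sketched as below. The caching layer (`make_speaker`) is an assumption added for illustration, to avoid re-hitting the network for phrases that repeat, such as obstacle warnings; the prototype's actual wiring may differ.

```python
import tempfile


def make_speaker(synthesize, play):
    # Cache synthesized audio per phrase so repeated warnings
    # don't require a fresh network round trip each time.
    cache = {}

    def speak(text):
        if text not in cache:
            cache[text] = synthesize(text)
        play(cache[text])

    return speak


def gtts_synthesize(text):
    # Requires the third-party gTTS package and an internet
    # connection; returns the path of the saved MP3 file.
    from gtts import gTTS
    path = tempfile.mktemp(suffix=".mp3")
    gTTS(text=text, lang="en").save(path)
    return path


def pygame_play(path):
    # Requires pygame; blocks until playback finishes.
    import pygame
    pygame.mixer.init()
    pygame.mixer.music.load(path)
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():
        pygame.time.wait(100)
```

Usage would be `speak = make_speaker(gtts_synthesize, pygame_play)` followed by `speak("Obstacle ahead")`. Even with caching, the first utterance of any phrase still needs the network, which is exactly the offline limitation described above.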
