Vision Voice

Enhancing Accessibility through AI and Voice Technology

Created on 29th June 2024

The problem Vision Voice solves

Challenge: Over 285 million people worldwide are visually impaired.
Need: Many videos lack descriptive audio, making it difficult for the visually impaired to understand visual content.
Impact: Limited access to educational, entertainment, and informative content.

Challenges we ran into

Understanding YouTube's API:

The YouTube API documentation is extensive, and it took a lot of time to understand how to use it properly. Managing API rate limits was another headache: I kept hitting the limits and had to wait before I could continue testing.
The authentication and authorization process was also cumbersome, requiring frequent tweaks and careful handling to ensure tokens were managed correctly.
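
One common way to soften rate-limit errors is to retry with exponential backoff. The sketch below is not the project's actual code. The helper names (backoffDelayMs, fetchWithBackoff), the default delays, and the choice of which status codes count as retryable are all assumptions made for illustration.

```javascript
// Delay before retry number `attempt` (0-based): baseMs * 2^attempt, capped at capMs.
function backoffDelayMs(attempt, baseMs = 500, capMs = 8000) {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Promise-based sleep used between retries.
function defaultSleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Hypothetical helper: calls `requestFn` until the response is not retryable
// or the retries run out. `requestFn` is assumed to return a fetch-style
// Response object with a numeric `status` field.
async function fetchWithBackoff(requestFn, maxRetries = 4, sleep = defaultSleep) {
  for (let attempt = 0; ; attempt++) {
    const response = await requestFn();
    // Assumption: 429 and 503 are treated as "try again later".
    const retryable = response.status === 429 || response.status === 503;
    if (!retryable || attempt >= maxRetries) return response;
    await sleep(backoffDelayMs(attempt));
  }
}
```

A call might look like fetchWithBackoff(() => fetch(url)), where url is whatever endpoint is being tested. Short waits like these only help with temporary throttling, so a quota that is exhausted for the day would still need a longer pause.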

Navigating YouTube's DOM Structure:

YouTube's DOM structure changes frequently, making it a moving target. Just when I thought I had figured it out, an update would come along and break everything.
Selecting and manipulating the right elements was a delicate task, and one wrong move could disrupt the entire page.
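
A small defence against a shifting DOM is to try several selectors in order rather than depend on a single one. The following is a minimal sketch under assumptions: the selector strings are illustrative guesses, not selectors confirmed by this project or guaranteed by YouTube, and findFirst is a hypothetical helper name.

```javascript
// Candidate selectors, most specific first. These are illustrative guesses,
// not selectors confirmed by the project or guaranteed by YouTube.
const VIDEO_SELECTORS = ["#movie_player video", "video.html5-main-video", "video"];

// Returns the first element matched by any selector, or null if none match.
// `root` is anything with a querySelector method, such as `document`.
function findFirst(root, selectors) {
  for (const selector of selectors) {
    const element = root.querySelector(selector);
    if (element) return element;
  }
  return null;
}
```

In a content script this would be called as findFirst(document, VIDEO_SELECTORS). When a page update breaks the most specific selector, the broader fallbacks can keep the extension working until the list is updated.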

Handling Asynchronous Data Loading:

YouTube loads a lot of content asynchronously, which meant that my extension needed to be highly responsive to changes. MutationObservers were helpful but added complexity.
Ensuring that my code executed at the right time and place was a constant battle, especially with the dynamic nature of YouTube's content.
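
The timing problem described above is often handled by waiting for an element with a MutationObserver and disconnecting as soon as it appears. The sketch below is one possible shape for that, not the extension's actual implementation. The helper name onElementReady is hypothetical, and its last two parameters exist only so it can be exercised outside a browser.

```javascript
// Calls `callback` once with the first element matching `selector`.
// `root` and `ObserverCtor` default to the page's document and MutationObserver;
// they are parameters only so the helper can be tested with stand-ins.
function onElementReady(selector, callback, root = document, ObserverCtor = MutationObserver) {
  const existing = root.querySelector(selector);
  if (existing) {
    callback(existing);
    return null; // element already present, nothing to observe
  }
  const observer = new ObserverCtor(() => {
    const element = root.querySelector(selector);
    if (element) {
      observer.disconnect(); // stop watching once the element appears
      callback(element);
    }
  });
  // Watch for nodes being added or removed anywhere under `root`.
  observer.observe(root, { childList: true, subtree: true });
  return observer;
}
```

Disconnecting after the first match keeps the observer from firing on every later page mutation, which is one way to limit the extra complexity that observers introduce.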

Technologies used

JavaScript

Flask

Python

Chrome Extension
