Created on 8th November 2024
Communication challenges affect many young children, hindering their ability to express themselves confidently and engage socially. Often, these challenges stem from speech, language, or voice disorders that begin in early childhood. For children ages 3 to 7, speech difficulties are particularly prevalent, with roughly 8% displaying noticeable issues, such as articulation disorders or stuttering, as they reach primary school age. Many of these cases have no known cause, yet the impact on a child’s self-esteem and social engagement can be significant.
Traditional speech therapy has proven beneficial, yet its accessibility is often limited by high costs, geographic constraints, and the inability to offer personalized, real-time guidance. As a result, many children are left without consistent, tailored support to address their unique needs.
The Challenge Code Blooded is Addressing:
Children struggling with specific sounds and phonemes often find it difficult to gain timely and relevant feedback through standard therapy approaches, which may lack personalization. Moreover, the high cost and limited accessibility of traditional methods can prevent children from receiving adequate practice and support.
How ArticulateIQ Tackles the Problem:
ArticulateIQ uses AI to analyze pronunciation through an initial assessment of letters and words, identifying the sounds a child finds most difficult. From this assessment, the app creates customized practice sessions with immediate feedback, allowing children to refine their pronunciation in real time.
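As a rough illustration of how an initial assessment might be turned into a list of sounds to practice, the sketch below tallies, per target sound, how often the transcribed word differed from the expected one. The function name `find_challenging_sounds`, the `(sound, expected, heard)` tuple layout, and the 0.5 threshold are all assumptions made for this example and are not taken from the ArticulateIQ codebase.

```python
from collections import defaultdict


def find_challenging_sounds(attempts, threshold=0.5):
    """Return sounds whose miss rate is at least `threshold`, hardest first.

    `attempts` is a hypothetical list of (sound, expected_word, heard_word)
    tuples collected during the assessment.
    """
    totals = defaultdict(int)
    misses = defaultdict(int)
    for sound, expected, heard in attempts:
        totals[sound] += 1
        # Count an attempt as a miss when the heard word differs from the target.
        if expected.strip().lower() != heard.strip().lower():
            misses[sound] += 1

    rates = {sound: misses[sound] / totals[sound] for sound in totals}
    flagged = [sound for sound, rate in rates.items() if rate >= threshold]
    # Highest miss rate first; ties broken alphabetically for stable output.
    return sorted(flagged, key=lambda sound: (-rates[sound], sound))


if __name__ == "__main__":
    sample = [
        ("r", "rabbit", "wabbit"),
        ("r", "red", "red"),
        ("s", "sun", "sun"),
        ("th", "thumb", "fum"),
    ]
    print(find_challenging_sounds(sample))
```

The flagged sounds could then drive which words appear in a child's practice session; how the app actually selects practice material is not described here.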
Developing our application with a diverse tech stack posed several intricate challenges, each of which we tackled with tailored solutions. Handling speech-to-text conversion using PyAudio alongside the Whisper model required significant effort, particularly to manage the variations in transcribed text. Rather than fine-tuning Whisper, we prioritized backend optimizations to achieve accurate text comparison. This included integrating natural language processing techniques to boost text analysis precision, resulting in strong, reliable performance.
Due to the large size of Blender video files, storing these on GitHub was unfeasible. To address this, we migrated our video assets to AWS S3, embedding links directly into our documentation. This approach allowed efficient handling of large assets without impacting application performance, and it also provided enhanced flexibility for video manipulation and updates.
Configuring the Gen AI model for voice tasks, along with designing effective prompts, was a complex undertaking. Ensuring the model accurately processed voice inputs and generated relevant responses was essential. Integrating data for report generation added another layer of complexity, necessitating precise data workflows to ensure output accuracy. With carefully crafted prompt engineering and optimized data integration, we were able to overcome these challenges.
Overall, these solutions strengthened our application, making it both scalable and robust, and underscoring our capability to deliver an adaptable, high-quality product.
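The text-comparison step mentioned above, where variations in Whisper's transcribed text have to be reconciled with the target word, could look something like the following minimal sketch. It uses only Python's standard library (`difflib`, `re`, `unicodedata`) as a stand-in for whatever NLP techniques the team actually used; the helper names `normalize`, `similarity`, and `is_match`, and the 0.85 threshold, are assumptions for illustration.

```python
import re
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip punctuation, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", text).lower()
    text = re.sub(r"[^\w\s]", "", text)  # drop punctuation such as "." or "!"
    return " ".join(text.split())


def similarity(expected, transcribed):
    """Character-level similarity in [0, 1] between the normalized strings."""
    return SequenceMatcher(None, normalize(expected), normalize(transcribed)).ratio()


def is_match(expected, transcribed, threshold=0.85):
    """Treat the attempt as correct when similarity reaches the threshold."""
    return similarity(expected, transcribed) >= threshold


if __name__ == "__main__":
    print(is_match("Rabbit", " rabbit!"))  # differs only in case and punctuation
    print(is_match("rabbit", "wabbit"))    # differs in the initial sound
```

Normalizing before comparing means that differences in capitalization, trailing punctuation, or spacing in the transcription are not counted as pronunciation errors, while a substituted sound still lowers the score. The threshold would need tuning against real recordings.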
Tracks Applied (5)
Major League Hacking
Major League Hacking
Major League Hacking
GitHub Education
Google For Developers