Speaker.ai is more than an educational public speaking app: it also finds relevant learning material for you. It takes you from start to finish in your learning journey, providing feedback and analytics on the videos you upload so you can improve every aspect of your public speaking.
Get started on the landing page by signing up for a free account, with user information protected by signed JSON Web Tokens (JWT).
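As a rough illustration of what JWT-based protection involves, here is a minimal sketch of HS256 signing and verification using only the Python standard library. It is not the project's actual auth code: the function names `sign_jwt` and `verify_jwt`, the secret, and the payload are all hypothetical. Note that a standard JWT payload is base64url-encoded and signed, so it is tamper-evident but still readable by anyone holding the token.

```python
import base64
import hashlib
import hmac
import json
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 with the trailing "=" padding removed.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def sign_jwt(payload: dict, secret: bytes) -> str:
    """Build a compact HS256 token: header.payload.signature (hypothetical helper)."""
    header = {"alg": "HS256", "typ": "JWT"}
    signing_input = ".".join(
        _b64url(json.dumps(part, separators=(",", ":"), sort_keys=True).encode())
        for part in (header, payload)
    )
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{_b64url(signature)}"


def verify_jwt(token: str, secret: bytes) -> Optional[dict]:
    """Return the payload if the signature matches, otherwise None."""
    try:
        header_b64, payload_b64, signature_b64 = token.split(".")
    except ValueError:
        return None  # not a three-part compact token
    signing_input = f"{header_b64}.{payload_b64}".encode()
    expected = _b64url(hmac.new(secret, signing_input, hashlib.sha256).digest())
    # Constant-time comparison avoids leaking how many characters matched.
    if not hmac.compare_digest(expected.encode(), signature_b64.encode()):
        return None
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))
```

This sketch skips expiry checks and header validation; a production service would more likely rely on a maintained library such as PyJWT.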
View and interact with the dashboard, or press Upload to provide a new video for analysis.
Once uploaded, the video is compressed by ffmpeg, both frame-wise and bit-wise, and then transcribed with Google Cloud's Speech-to-Text.
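The compression step might look something like the sketch below. It assumes that "framewise and bitwise" means lowering the output frame rate and the video bitrate, which the description does not spell out; the function names, default values, and file names are hypothetical. The flags themselves (`-y`, `-i`, `-r`, `-b:v`) are standard ffmpeg options.

```python
import subprocess


def build_ffmpeg_command(src, dst, fps=15, video_bitrate="800k"):
    """Return the ffmpeg argument list for a smaller copy of `src`.

    Assumed interpretation: fewer frames per second (-r) and a lower
    target video bitrate (-b:v). The defaults are illustrative only.
    """
    return [
        "ffmpeg",
        "-y",                   # overwrite the output file if it exists
        "-i", src,              # input video
        "-r", str(fps),         # output frame rate
        "-b:v", video_bitrate,  # target video bitrate
        dst,
    ]


def compress_video(src, dst, **options):
    """Run ffmpeg; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_ffmpeg_command(src, dst, **options), check=True)
```

Keeping the command builder separate from the subprocess call makes the settings easy to inspect and test without ffmpeg installed.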
Next, API calls to VertexAI Vision and Gemini, coordinated through LangChain so that they run concurrently, apply carefully researched and engineered prompts to extract key points, feedback, and a personalized course outline. The results are stored in MongoDB.
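The benefit of running the model requests concurrently can be sketched with plain `asyncio`. This is not LangChain and not the project's code: `call_model` is a hypothetical stand-in that only simulates network latency, and the three task names are taken from the outputs described above.

```python
import asyncio


async def call_model(task: str, transcript: str) -> str:
    """Hypothetical stand-in for a real Gemini or VertexAI request."""
    await asyncio.sleep(0.01)  # simulate network latency
    return f"{task} for {len(transcript.split())} words"


async def analyze(transcript: str) -> dict:
    """Issue all prompts at once instead of waiting for each in turn."""
    tasks = ["key_points", "feedback", "course_outline"]
    # gather() runs the coroutines concurrently and returns results in order.
    results = await asyncio.gather(*(call_model(t, transcript) for t in tasks))
    return dict(zip(tasks, results))
```

Calling `asyncio.run(analyze(transcript))` returns one dictionary with an entry per task, and the total wait is roughly that of the slowest request rather than the sum of all three.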
This data is fed into the YouTube API to find the most suitable educational video.
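A video lookup of this kind could be made through the YouTube Data API v3 `search` endpoint. The sketch below only builds the request URL; the function name, the example topic, and the choice to request a single result are assumptions, and how the project actually ranks "most suitable" is not stated.

```python
from urllib.parse import urlencode

YOUTUBE_SEARCH_ENDPOINT = "https://www.googleapis.com/youtube/v3/search"


def build_search_url(topic: str, api_key: str, max_results: int = 1) -> str:
    """Build a search request for videos matching `topic` (hypothetical helper)."""
    params = {
        "part": "snippet",          # return title, description, thumbnails
        "q": topic,                 # search terms, e.g. a course outline item
        "type": "video",            # exclude channels and playlists
        "maxResults": max_results,
        "key": api_key,
    }
    return f"{YOUTUBE_SEARCH_ENDPOINT}?{urlencode(params)}"
```

Sending a GET request to the resulting URL returns JSON whose `items` list holds the matching videos, each with an `id.videoId` field.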
On their dashboard, users can view their results and past recordings, and edit previously created lesson plans to refine their study curriculum.
Speaker.ai reduces the time it takes each person to find relevant material and lowers the barrier to entry for learning new skills efficiently.
Our journey in creating a scalable and efficient application was not without its trials and tribulations. With the large number of API calls made to VertexAI and Gemini, hallucinations had to be minimized through careful prompt engineering and relentless testing. Additionally, integrating the many frameworks and libraries we relied on made connecting and testing modules a challenge.
Tracks Applied (2)
Farcaster Builders India
Stackr Labs