Protocall

“From preparation to presence.”

Created on 3rd January 2026

The Problem Protocall Solves

Interview preparation today is largely one-dimensional. Most platforms focus on what a candidate says, but real interviews also evaluate how a candidate communicates: their confidence, clarity, tone, and non-verbal presence. Access to realistic mock interviews with immediate, objective feedback is limited, expensive, or inconsistent, leaving candidates underprepared despite strong technical skills.

How Protocall Makes Interview Preparation Better

Protocall transforms interview practice into a realistic, AI-driven simulation using Google AI Studio, making preparation smarter, safer, and more effective.

🎯 What People Use Protocall For

  • Practicing real-time, voice-to-voice mock interviews
  • Improving communication, confidence, and clarity
  • Preparing for role-specific interviews (Frontend, Backend, Leadership, System Design)
  • Receiving objective, unbiased feedback without human pressure
  • Tracking personal growth across multiple interview sessions

🧠 How It Makes Existing Tasks Easier & Safer

  • Instant feedback instead of delayed or subjective reviews
  • 24/7 availability without scheduling peers or mentors
  • Privacy-first practice with no persistent data storage
  • Reduced anxiety through judgment-free AI coaching
  • Consistent evaluation using standardized AI metrics

🚀 What Makes It Different

Powered by Google AI Studio, Protocall uses multimodal intelligence to:

  • Analyze spoken responses in real time
  • Interpret non-verbal cues like eye contact and posture
  • Adapt interview difficulty dynamically
  • Deliver actionable insights immediately after the session

Impact

Protocall bridges the gap between preparation and performance by turning interview practice into measurable intelligence, helping candidates walk into real interviews with confidence, clarity, and control.

🛠 Tech Stack Used

Frontend

  • React 19 – Leveraging concurrent rendering and modern hooks for a smooth, responsive UI.
  • TypeScript – Strong typing for managing complex AI interaction states, audio streams, and UI logic.
  • Tailwind CSS – Utility-first styling with a custom professional palette and Glassmorphism design.

Artificial Intelligence & AI Orchestration

  • Genkit (Google AI Studio) – Used to orchestrate agent-based reasoning and manage multimodal AI workflows.
  • Gemini 2.5 Flash (Native Audio) – Powers low-latency, real-time voice-to-voice interview conversations.
  • Gemini APIs – Enables live transcription, reasoning, evaluation, and function calling.
  • Function Calling – Allows the AI agent to update visual feedback in real time without interrupting the interview flow.

Multimedia & Web APIs

  • Web Audio API – Real-time audio capture, PCM encoding/decoding, and streaming.
  • MediaDevices API – Camera and microphone access for multimodal interaction.
  • Canvas API – Video frame extraction for visual cue analysis.
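To illustrate the kind of PCM handling the Web Audio API work above involves, here is a minimal sketch (not the actual Protocall code) of converting Float32 audio samples into 16-bit signed PCM for streaming, and back again for playback:

```typescript
// Convert Web Audio Float32 samples (range [-1, 1]) into 16-bit
// signed PCM, the raw format typically streamed to speech APIs.
function floatTo16BitPCM(samples: Float32Array): Int16Array {
  const pcm = new Int16Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    const s = Math.max(-1, Math.min(1, samples[i])); // clamp to avoid overflow
    pcm[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return pcm;
}

// Inverse conversion for playing back PCM received from the server.
function pcm16ToFloat(pcm: Int16Array): Float32Array {
  const out = new Float32Array(pcm.length);
  for (let i = 0; i < pcm.length; i++) {
    out[i] = pcm[i] < 0 ? pcm[i] / 0x8000 : pcm[i] / 0x7fff;
  }
  return out;
}
```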

Data Visualization

  • Recharts – Interactive radar and bar charts for post-interview performance analytics.

Infrastructure & Delivery

  • ESM.sh – Zero-install ES module CDN for fast, serverless dependency delivery.
  • HTML5 / CSS3 – Modern web standards for performance and accessibility.

Challenges We Ran Into

Building Protocall required solving several technical and design challenges to deliver a realistic, low-latency, and privacy-first interview experience.

🔊 Real-Time Audio Latency

Creating natural, voice-to-voice conversations was challenging due to strict latency requirements. We had to carefully manage raw PCM audio streaming and buffering to ensure smooth, uninterrupted dialogue over the Gemini Live API.
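A simplified sketch of the buffering this requires (illustrative only, not the production code): incoming PCM chunks of arbitrary size are queued and drained in fixed-size frames, so playback never starves or stutters when network delivery is bursty:

```typescript
// Accumulates PCM chunks of arbitrary size and releases fixed-size
// frames, smoothing out network jitter before playback.
class PcmFrameBuffer {
  private queue: number[] = [];

  constructor(private readonly frameSize: number) {}

  push(chunk: Int16Array): void {
    for (const sample of chunk) this.queue.push(sample);
  }

  // Returns one full frame, or null if not enough samples have arrived yet.
  nextFrame(): Int16Array | null {
    if (this.queue.length < this.frameSize) return null;
    return Int16Array.from(this.queue.splice(0, this.frameSize));
  }
}
```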

🎥 Multimodal Synchronization

Coordinating live audio, video frames, transcription, and AI reasoning in parallel was complex. Ensuring that visual cue analysis aligned accurately with spoken responses required precise timing and frame extraction logic.
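One way to sketch this alignment problem (the interfaces and field names here are illustrative assumptions, not the real data model): pair each transcript segment with the visual cue captured closest to the segment's midpoint, so spoken words and body language line up:

```typescript
interface FrameCue { timestampMs: number; eyeContact: boolean; }
interface Segment { startMs: number; endMs: number; text: string; }

// For each transcript segment, pick the video-frame analysis whose
// timestamp is nearest the segment's temporal midpoint.
function alignCues(
  segments: Segment[],
  cues: FrameCue[],
): Array<{ segment: Segment; cue: FrameCue | null }> {
  return segments.map((segment) => {
    const mid = (segment.startMs + segment.endMs) / 2;
    let best: FrameCue | null = null;
    for (const cue of cues) {
      if (!best || Math.abs(cue.timestampMs - mid) < Math.abs(best.timestampMs - mid)) {
        best = cue;
      }
    }
    return { segment, cue: best };
  });
}
```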

🧠 Agent-Based Reasoning Design

Designing an AI agent that could perceive, reason, and act in real time, without interrupting the interview flow, was a key challenge. This involved implementing function calling so the agent could update UI feedback dynamically while maintaining conversational continuity.
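The dispatch side of such a design can be sketched as a registry that maps model-issued function-call names to UI handlers, so feedback renders without pausing the conversation loop. The call shape and handler names below are hypothetical, chosen for illustration:

```typescript
// Hypothetical shape of a function call emitted by the model.
interface FunctionCall { name: string; args: Record<string, unknown>; }

type UiHandler = (args: Record<string, unknown>) => string;

// Registry mapping function-call names to UI update handlers.
// Handler names here are illustrative, not the real Protocall set.
const uiHandlers: Record<string, UiHandler> = {
  updateConfidenceMeter: (args) => `confidence=${args.score}`,
  flagPosture: (args) => `posture=${args.status}`,
};

// Route a model-issued call to its handler; unknown calls are
// reported rather than thrown, so the interview loop never breaks.
function dispatch(call: FunctionCall): string {
  const handler = uiHandlers[call.name];
  if (!handler) return `unhandled:${call.name}`;
  return handler(call.args);
}
```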

🖥 UI Feedback Without Distraction

Delivering real-time behavioral feedback (confidence, eye contact, posture) without overwhelming or distracting the user required multiple UI iterations. We balanced visibility and subtlety to maintain interview realism.

📊 Meaningful Evaluation Metrics

Translating subjective interview qualities like confidence and clarity into structured, measurable scores and radar charts was non-trivial. We refined scoring weights and feedback logic to ensure fairness and interpretability.
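The scoring idea can be sketched roughly as follows (a minimal illustration under assumed metric names and weights, not the actual scoring logic): each metric is scaled to 0–100 for its radar axis, and a single weighted overall score is derived with normalized weights:

```typescript
interface Metric { name: string; raw: number; weight: number; } // raw in [0, 1]

// Scale each metric to 0-100 for the radar axes and compute one
// weighted overall score; weights are normalized, so any positive
// weights work without summing to 1.
function scoreSession(
  metrics: Metric[],
): { axes: Array<{ name: string; score: number }>; overall: number } {
  const totalWeight = metrics.reduce((sum, m) => sum + m.weight, 0);
  const axes = metrics.map((m) => ({ name: m.name, score: Math.round(m.raw * 100) }));
  const overall = Math.round(
    metrics.reduce((sum, m) => sum + (m.raw * m.weight) / totalWeight, 0) * 100,
  );
  return { axes, overall };
}
```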

🛡 Privacy & Security Constraints

We deliberately avoided persistent data storage to protect user privacy. This required careful session handling and real-time-only processing, which added complexity to analytics generation.
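The session-handling pattern this implies can be sketched as an in-memory, session-scoped store (a simplified illustration, not the real implementation): raw metrics live only for the duration of one interview and are wiped as soon as the summary is produced:

```typescript
// In-memory, session-scoped store: raw samples never touch disk and
// are discarded once the post-interview summary is computed.
class EphemeralSession {
  private samples: number[] = [];

  record(score: number): void {
    this.samples.push(score);
  }

  // Produce final analytics, then wipe all raw data.
  finish(): { average: number; count: number } {
    const count = this.samples.length;
    const average =
      count === 0 ? 0 : this.samples.reduce((a, b) => a + b, 0) / count;
    this.samples = []; // nothing persists beyond the session
    return { average, count };
  }
}
```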

⚙️ Browser & Device Constraints

Working with the Web Audio, MediaDevices, and Canvas APIs across browsers introduced compatibility and performance challenges, especially for camera and microphone handling.

Overcoming these challenges allowed us to build a robust, scalable, and immersive AI interview coaching platform that closely mirrors real-world interview dynamics.
