HazelnutPilot AI
Turns product requirements into live tests
Created on 19th August 2025
The problem HazelnutPilot Ai solves
Quality Assurance (QA) testing today is slow, repetitive, and often requires technical knowledge. Non-technical teams struggle to create and run tests, and developers waste time on manual flows. Hazelnut AI automates this: you upload a PRD → AI generates test cases → Playwright executes them → results, issues, and artifacts appear in a clean dashboard. This saves hours of repetitive QA and makes testing accessible to everyone.
Demo instructions
**⚠️ Heads up:** the first load may take ~15–45s because the API runs on Render (free tier) and needs to wake up. If it spins, give it a moment and hit Refresh.

1. Create Project → set the Base URL (you can get this from the PRD itself).
2. Download the sample PRD from the dashboard → Upload PRD in the Project.
3. Click Run tests → open the video, view the first-fail screenshot, and download Issues.xlsx.

(If you need links: UI: <your-ui-url>, API: <your-api-url>)
Impact / who benefits
QA & SDETs: baseline smoke in minutes, not days.
PM/Eng: reproducible evidence, faster triage.
What I built during the hackathon (accomplishments)
End-to-end PRD→tests pipeline with guardrails
Playwright runner + artifacts + first-fail stop
Dashboard with KPIs, animated pie, run history
Shareable viewer for results
MVP sample PRD + guided notice
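The runner's first-fail stop above can be sketched as a simple loop. The `Step`/`StepResult` types and the `execute` callback are hypothetical stand-ins; the real runner drives Playwright and saves video/screenshot artifacts at the stop point:

```typescript
// Minimal sketch of a first-fail runner loop (hypothetical types; the
// real pipeline drives a Playwright browser, not a callback).
type Step = { action: string; target: string };
type StepResult = { step: Step; ok: boolean; error?: string };

function runSteps(
  steps: Step[],
  execute: (s: Step) => boolean, // stand-in for the actual browser action
): { results: StepResult[]; stoppedEarly: boolean } {
  const results: StepResult[] = [];
  for (const step of steps) {
    const ok = execute(step);
    results.push({ step, ok, error: ok ? undefined : `failed: ${step.action}` });
    if (!ok) {
      // First-fail stop: this is where artifacts (video, screenshot) get saved.
      return { results, stoppedEarly: true };
    }
  }
  return { results, stoppedEarly: false };
}
```

Stopping at the first failure keeps runs cheap and makes the failing screenshot unambiguous: it always shows the first broken step, not a cascade of follow-on errors.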
Challenges I ran into
Turning AI output into something you can actually run
The model sometimes wrote vague or inconsistent steps. I built a safety net that cleans and checks every step, only allowing a small, predictable set of actions. If something looks off, it gets fixed or rejected before it ever hits the browser.
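A minimal sketch of that guardrail, assuming a simple step shape — the action whitelist and field names here are illustrative, not the project's real schema:

```typescript
// Whitelist-based sanitizer for AI-generated test steps (illustrative schema).
// Only a small, predictable set of actions is allowed through.
const ALLOWED_ACTIONS = new Set(["goto", "click", "fill", "select", "expect_text", "expect_url"]);

interface RawStep { action?: string; target?: string; value?: string }
interface CleanStep { action: string; target: string; value?: string }

function sanitizeSteps(raw: RawStep[]): CleanStep[] {
  const clean: CleanStep[] = [];
  for (const s of raw) {
    const action = (s.action ?? "").trim().toLowerCase();
    const target = (s.target ?? "").trim();
    // Reject anything outside the action set or missing a target.
    if (!ALLOWED_ACTIONS.has(action) || target === "") continue;
    // "fill" and "select" need a value; drop the step if the model omitted it.
    if ((action === "fill" || action === "select") && !s.value) continue;
    clean.push({ action, target, value: s.value });
  }
  return clean;
}
```

Normalizing case and whitespace first means near-miss output like `"Click "` still passes, while genuinely unknown or incomplete steps never reach the browser.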
Clicking the right thing on messy web pages
Real sites have multiple “Login” buttons and tricky dropdowns. I taught the runner to look for human-readable labels and roles (the same cues screen readers use), and to sanity-check with the page URL. Dropdowns got a special “be-smart-about-it” handler.
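That disambiguation can be sketched as a scoring function over candidate elements. In the real runner this maps onto Playwright's role/text locators; the `Candidate` shape and the weights below are illustrative:

```typescript
// Score candidates by how well they match a human-readable label and ARIA
// role (illustrative stand-in for Playwright-style role/text locators).
interface Candidate { role: string; text: string; visible: boolean }

function pickBest(label: string, role: string, candidates: Candidate[]): Candidate | null {
  let best: Candidate | null = null;
  let bestScore = 0;
  const l = label.trim().toLowerCase();
  for (const c of candidates) {
    if (!c.visible) continue;            // screen readers skip hidden nodes too
    let score = 0;
    if (c.role === role) score += 2;     // matching ARIA role is a strong cue
    const t = c.text.trim().toLowerCase();
    if (t === l) score += 3;             // exact visible-text match wins
    else if (t.includes(l)) score += 1;  // partial match, e.g. "Login now"
    if (score > bestScore) { bestScore = score; best = c; }
  }
  return best;
}
```

Given a page with a "Login" link in the nav and a "Login" button in the form, asking for role `button` picks the form button, which is usually what the PRD step meant.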
Cloud browsers can be noisy and flaky
In hosted environments, analytics/ads slow things down and cause random errors. I blocked those third-party scripts, tuned the waits, and made sure we always save a video and screenshot so there’s proof even when something fails.
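The blocking decision can be sketched as a URL filter. The domain list is illustrative, and in the real runner a check like this would be wired into the browser's request interception:

```typescript
// Decide whether a request is third-party noise worth blocking
// (illustrative domain list; plugged into request interception in practice).
const BLOCKED_HOSTS = ["google-analytics.com", "doubleclick.net", "facebook.net", "hotjar.com"];

function shouldBlock(requestUrl: string, appHost: string): boolean {
  let host: string;
  try {
    host = new URL(requestUrl).hostname;
  } catch {
    return true; // malformed URLs never reach the page
  }
  // First-party requests (the app itself and its subdomains) always pass.
  if (host === appHost || host.endsWith("." + appHost)) return false;
  return BLOCKED_HOSTS.some((b) => host === b || host.endsWith("." + b));
}
```

Matching on the hostname suffix rather than the full URL catches regional variants (`www.google-analytics.com`, `ssl.google-analytics.com`) without maintaining a huge list.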
Good UX under a 48-hour sprint
I cut scope to a straight line: Upload PRD → Run → See results. A sample-PRD banner helps first-time users succeed fast, and the dashboard shows only what matters (pass/fail pie, counts, history) without distractions.
Tracks Applied
🛤️ AI Agents & Workflow Automation (primary)
Also touches AI Evals & Observability (we track run history, pass/fail rates, and bug reports).
Technologies used
