GuardianAI: Real-time Firewall for LLMs
Created on 11th February 2026
The problem GuardianAI solves
Companies are rushing to integrate Large Language Models (LLMs) into their products, but they are ignoring a critical risk: AI models are insecure by default. Traditional firewalls cannot understand natural language, leaving AI applications wide open to a new generation of cyber threats.
We identified three massive problems that current tools don't solve:
1. Data Leakage (DLP): Employees accidentally paste sensitive data (like API keys, PII, or proprietary code) into chatbots. A famous example is Samsung engineers leaking source code to ChatGPT.
2. Prompt Injection: Attackers can "jailbreak" AI models using "DAN" prompts or hidden instructions to bypass safety filters and exfiltrate system prompts or data.
3. Rogue Agents: As AI agents get access to tools (like databases), they can be tricked into executing dangerous commands (like DROP TABLE users) if not properly monitored.
The Gap: While there are thousands of tools to build AI, there is almost no infrastructure to secure it. GuardianAI fills this gap by acting as a real-time firewall for the LLM era.
Challenges we ran into
1. The "Latency vs. Security" Trade-off
The biggest hurdle was adding a security layer without making the chat feel slow. Initially, our PII scanner added a 2-second delay to every message, which ruined the user experience.
How we fixed it: We optimized our regex engine and switched to asynchronous processing in FastAPI (async def). We also implemented a "Fail-Open" logic for non-critical checks, ensuring the user never experiences more than 200ms of added latency.
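The fail-open idea can be sketched with plain asyncio: run a non-critical check under a hard latency budget and let the message through if the check cannot finish in time. The `pii_scan` function and the 200ms budget below are illustrative stand-ins, not our production scanner.

```python
import asyncio

async def pii_scan(message: str) -> bool:
    # Hypothetical placeholder for the real PII scanner.
    # Returns True if the message is safe to forward.
    await asyncio.sleep(0.01)  # simulate scan work
    return "sk-" not in message  # toy rule: flag API-key-like prefixes

async def guarded_check(message: str, budget_s: float = 0.2) -> bool:
    """Run a non-critical check under a latency budget; fail open on timeout."""
    try:
        return await asyncio.wait_for(pii_scan(message), timeout=budget_s)
    except asyncio.TimeoutError:
        # Fail-open: never block the user because a non-critical check was slow.
        return True

print(asyncio.run(guarded_check("hello world")))        # allowed
print(asyncio.run(guarded_check("here is sk-ABC123")))  # flagged
```

Critical checks (e.g. prompt-injection detection) would instead fail closed; only the low-risk checks take the fail-open path.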
2. Handling "Indirect" Prompt Injections (RAG)
Detecting malicious commands hidden inside uploaded files (like PDFs) was incredibly difficult because standard text extraction would "activate" the command.
How we fixed it: We built a sandboxed text extraction pipeline that treats all file content as "untrusted data" and scans it for high-risk patterns (like SYSTEM INSTRUCTION:) before it ever gets vectorized or stored in the database.
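The core of that pipeline is treating extracted text purely as data: match it against a ruleset of high-risk patterns and refuse to vectorize on any hit. A minimal sketch, with a deliberately small and hypothetical pattern list:

```python
import re

# Hypothetical high-risk patterns; the real ruleset is larger.
HIGH_RISK_PATTERNS = [
    re.compile(r"SYSTEM\s+INSTRUCTION\s*:", re.IGNORECASE),
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
]

def scan_untrusted_text(text: str) -> list:
    """Scan extracted file content as inert data; return matched patterns."""
    return [p.pattern for p in HIGH_RISK_PATTERNS if p.search(text)]

def safe_to_vectorize(text: str) -> bool:
    # Only clean text proceeds to the embedding/storage step.
    return not scan_untrusted_text(text)
```

Because the scan never passes the text to an LLM, a hidden instruction inside a PDF is just a string match, never an executed command.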
3. False Positives in Data Loss Prevention (DLP)
Our initial DLP scanner was too aggressive—it would flag standard database IDs as "Credit Card Numbers" and block them.
How we fixed it: We implemented context-aware validation (Luhn algorithm for credit cards) and added a "Trust Score" system. Instead of blocking everything, we now use a weighted risk score (0-100) to make smarter decisions.
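The Luhn check is what separates a real credit card number from a random database ID of the same length. A minimal sketch of the validation plus a toy weighted score (the weights below are illustrative, not our tuned values):

```python
import re

def luhn_valid(number: str) -> bool:
    """Luhn checksum: double every second digit from the right."""
    digits = [int(d) for d in number if d.isdigit()]
    if len(digits) < 13:
        return False
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

CARD_RE = re.compile(r"\b\d{13,16}\b")

def risk_score(message: str) -> int:
    """Weighted risk score in 0-100; weights here are hypothetical."""
    score = 0
    for candidate in CARD_RE.findall(message):
        if luhn_valid(candidate):
            score += 60  # Luhn-valid run of digits: likely a real card
        else:
            score += 5   # fails Luhn: probably a database ID, low weight
    return min(score, 100)
```

A Luhn-valid number like 4111111111111111 pushes the score into blocking territory, while a 13-digit internal ID that fails the checksum barely registers, which is exactly the false-positive behavior we wanted.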
Technologies used