Arbiter
Beyond blind trust: Cross-examine every commitment
Created on 25th January 2026
The problem Arbiter solves
In global commerce, the primary obstacle isn't a lack of data, but the presence of contradictory and adversarial information. Current procurement and planning decisions rely heavily on supplier-provided delivery guarantees and static historical averages—sources that are often structurally biased toward optimism.
ARBITER solves the "Optimism Bias" in supply chain commitments by replacing blind trust with Adversarial Consensus.
Core Value Proposition
- Challenges Unreliable Commitments: Supplier on-time delivery performance often deviates from promised levels by 20–40%. ARBITER cross-examines these claims in real time.
- Synthesizes Conflicting Signals: Instead of averaging data and suppressing critical minority risks—like a localized port strike or an emerging storm surge—ARBITER preserves and analyzes these contradictions.
- Provides Probabilistic Transparency: It converts conflicting agent outputs into a Risk-Adjusted Confidence Score, allowing decision-makers to act on probability rather than hope.
- Facilitates Proactive Contingency: By identifying the specific root cause of disagreement (e.g., seasonal port congestion), ARBITER empowers firms to deploy pre-planned Plan B strategies before a failure occurs.
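The idea of a Risk-Adjusted Confidence Score that does not average away a minority risk can be illustrated with a minimal sketch. This is not ARBITER's actual scoring formula, which the write-up does not specify. The `AgentEstimate` type, the `risk_adjusted_confidence` function, the `dissent_weight` parameter, and the sample numbers are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Dict, List, Union


@dataclass
class AgentEstimate:
    """One agent's view of a shipment (hypothetical structure)."""
    agent: str          # illustrative agent label
    p_on_time: float    # estimated probability of on-time delivery, 0..1
    rationale: str      # short justification, e.g. a cited risk


def risk_adjusted_confidence(
    estimates: List[AgentEstimate], dissent_weight: float = 0.5
) -> Dict[str, Union[float, str]]:
    """Blend the average estimate with the most pessimistic one.

    A plain average would dilute a single agent's warning; weighting the
    worst case keeps that minority signal visible in the final score.
    """
    if not estimates:
        raise ValueError("at least one estimate is required")
    if not 0.0 <= dissent_weight <= 1.0:
        raise ValueError("dissent_weight must be between 0 and 1")

    probs = [e.p_on_time for e in estimates]
    dissenter = min(estimates, key=lambda e: e.p_on_time)
    score = (1 - dissent_weight) * mean(probs) + dissent_weight * dissenter.p_on_time

    return {
        "score": round(score, 3),
        "spread": round(max(probs) - min(probs), 3),  # size of the disagreement
        "dissenter": dissenter.agent,
        "root_cause": dissenter.rationale,
    }


# Illustrative inputs only; these are not measured values.
sample = [
    AgentEstimate("supplier", 0.95, "contractual guarantee"),
    AgentEstimate("logistics", 0.80, "historical lane performance"),
    AgentEstimate("weather", 0.50, "storm surge forecast near destination port"),
]
result = risk_adjusted_confidence(sample)
```

With `dissent_weight=0.5`, the score sits halfway between the mean and the lowest estimate, and the returned `root_cause` names the reason for the disagreement, which is the piece a contingency plan would key on.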
B2B Applications & Operational Impact
As a B2B-first application, ARBITER integrates directly into the executive decision-making loop, making existing logistics tasks significantly safer and more efficient:
- Procurement & Sourcing: Managers can use the platform to vet new suppliers by running "Adversarial Stress Tests" against their promised transit times.
- Inventory Management: CFOs can better plan buffer stocks based on probabilistic confidence levels rather than binary "On-Time/Delayed" estimates.
- Logistics Monitoring: Operations teams can monitor high-value shipments using real-time telemetry from the Logistics and Weather agents to identify "Black Swan" events as they develop.
Multi-Domain Scalability
While demonstrated here for maritime logistics, the adversarial multi-agent logic is a universal framework applicable to any high-stakes domain where data is volatile:
- AgriTech: Challenging crop yield promises by debating soil sensor data against climate forecasts and market demand volatility.
- FinTech: Cross-examining loan risk by pitting internal credit audits against macro-economic sentiment and real-time market fluctuations.
- HealthTech: Resolving diagnostic contradictions by cross-referencing patient symptoms with clinical lab results and drug interaction databases.
The Technical Foundation
- Orchestration: Built on the Google Agent Development Kit (ADK) to manage the sequential "Round Table" logic and shared session state.
- Reasoning Engine: Powered by Gemini 2.0 Flash, enabling high-speed processing of structured metrics and unstructured web intelligence.
- Fact Grounding: Real-time data ingestion via SerpApi (Google Search) and Open-Meteo (Weather Telemetry) ensures every agent assertion is backed by a verifiable source.
Challenges I ran into
The primary hurdle was mastering the Google Agent Development Kit (ADK) for the first time. Managing the shared session.state was critical to ensure the Extraction Specialist could pass structured data to downstream agents without losing context. I overcame this by implementing a SequentialAgent wrapper that effectively turned each agent’s output into a "growing file" of evidence for the next.
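The "growing file" pattern described above can be sketched in plain Python. This sketch deliberately does not call the ADK API; it only mimics the idea of a sequential runner in which each agent reads the shared state and writes its output under its own key. The agent functions, key names, and sample text are hypothetical stand-ins.

```python
from typing import Callable, Dict, List, Tuple

State = Dict[str, str]
Agent = Callable[[State], str]


def extraction_specialist(state: State) -> str:
    # Stand-in for an LLM call that extracts structured facts from a claim.
    return f"Supplier promises delivery in {state['promised_days']} days."


def logistics_agent(state: State) -> str:
    # Reads the upstream extraction before adding its own evidence.
    return f"Seen: [{state['extraction']}] Port congestion may add 4 days."


def skeptic_agent(state: State) -> str:
    # Sees everything written so far, so it can challenge earlier claims.
    evidence = " | ".join(state[key] for key in ("extraction", "logistics"))
    return f"Possible optimism bias given: {evidence}"


def run_sequential(pipeline: List[Tuple[str, Agent]], initial: State) -> State:
    """Run agents in order; each output is appended to the shared state."""
    state = dict(initial)  # copy so the caller's dict is not mutated
    for output_key, agent in pipeline:
        state[output_key] = agent(state)
    return state


PIPELINE: List[Tuple[str, Agent]] = [
    ("extraction", extraction_specialist),
    ("logistics", logistics_agent),
    ("skeptic", skeptic_agent),
]

final_state = run_sequential(PIPELINE, {"promised_days": "14"})
```

Because every agent writes to a distinct key and nothing is overwritten, the last agent in the sequence receives the full evidence trail rather than only the previous agent's reply.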
Achieving a genuine "Round Table" conversation feel required deep prompt engineering to prevent context drift. I utilized ADK’s built-in session management to maintain a continuous conversational memory. This allowed the Skeptic Agent to specifically identify "Optimism Bias" in the data provided by previous agents, transforming a robotic sequence into a live adversarial debate.
The final challenge was developing a Next.js UI that visually represented this backend complexity. To provide a real-time experience, I used Server-Sent Events (SSE) to stream agent reasoning character-by-character. I then utilized Framer Motion and trigonometry to build a circular "Adversarial Arena" where visual tracers and grounded citations from SerpApi gave users the feeling of watching a live, data-backed discussion.
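Two of the mechanics mentioned above, character-level SSE streaming and the trigonometric placement of agents around a circle, can be sketched as follows. The project's UI is built in Next.js, so this Python version only illustrates the underlying logic and is not the project's code. The function names, the `done` event, and the choice to JSON-encode each character are assumptions.

```python
import json
import math
from typing import Iterator, List, Tuple


def sse_events(text: str) -> Iterator[str]:
    """Yield one Server-Sent Events frame per character of `text`.

    Each character is JSON-encoded so that a newline in the agent's
    reasoning cannot break the "data: ...\\n\\n" framing.
    """
    for ch in text:
        yield f"data: {json.dumps(ch)}\n\n"
    # Hypothetical terminal event so the client knows the stream ended.
    yield "event: done\ndata: {}\n\n"


def seat_positions(
    n: int, radius: float, cx: float = 0.0, cy: float = 0.0
) -> List[Tuple[float, float]]:
    """Place `n` agents evenly on a circle, starting at the top.

    Screen coordinates are assumed (y grows downward), so an angle of
    -pi/2 puts the first seat directly above the centre.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    positions = []
    for i in range(n):
        angle = 2 * math.pi * i / n - math.pi / 2
        positions.append((cx + radius * math.cos(angle),
                          cy + radius * math.sin(angle)))
    return positions


frames = list(sse_events("Hi"))
seats = seat_positions(4, 100.0)
```

On the client side, each frame's payload would be parsed and appended to the speaking agent's bubble, while the seat coordinates would feed the animated layout of the arena.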
Technologies used
