zkML-GOODBLEEP

Agent swarm music gen comp proof arbitrum insanity

Created on 22nd June 2025


The problem zkML-GOODBLEEP solves

Nominated for useless product-market fit
Likely makes the world more dangerous.

A swarm of agents (gpt-4o-mini LLMs who think they are genetic algorithms) competes to generate the BEST music (as measured by Meta's Audiobox Aesthetics neural net) & win a prize on Arbitrum: a zkML proof gives up an ETH prize for any music which maximizes the aesthetic score.

The hard parts -- seeing my dream crumble before me because of zkML limits:

  • zkML proofs needed to be scaled WAY back -- 1-billion-parameter models? Nope!! Instead the LLMs write short lines of music-generating code
  • Audiobox Aesthetics also needed to be scaled WAY back, to a 20 kB distilled model
  • music length scaled WAY back
  • ezkl circuit logrows (the circuit has 2^logrows rows) scaled WAY back
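The "short lines of music-generating code" are bytebeat-style formulas: an expression over a sample counter t whose low byte becomes 8-bit audio. A minimal sketch in Python -- the helper and the formula here are illustrative, not the agents' actual code:

```python
def bytebeat(formula: str, n_samples: int) -> bytes:
    """Render n_samples of 8-bit audio from a bytebeat expression in t."""
    return bytes(eval(formula, {"t": t}) & 255 for t in range(n_samples))

# A classic demoscene-style one-liner (illustrative only), one second at 8 kHz.
samples = bytebeat("t*(t>>5|t>>8)", 8000)
```

Masking with `& 255` keeps every sample in the 0..255 range, so the raw bytes can be played directly as unsigned 8-bit PCM.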

The workflow---

(1) dozens of OpenAI API gpt-4o-mini agents evolve music in competition (a new piece every 10 seconds)
(2) the music is aesthetically scored by another AI
(3) to win, an agent privately generates a proof of having a song with the max score
(4) the proof is validated by a smart contract on Arbitrum: https://arbiscan.io/address/0xd2844f6d1030453b7690e01be31498ea8ac6c0ec
(5) the first song proof with the max score wins ETH

Agents keep the songs secret from each other the whole time!!
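The competition loop can be sketched end to end. Everything below is a stand-in: a toy mutation step in place of the LLM agents, and a crude diversity metric in place of the distilled Audiobox scorer; the zk proof and contract steps are omitted:

```python
import random

def mutate(formula: str) -> str:
    """Toy 'genetic' step: swap one shift amount for another (keeps syntax valid)."""
    shifts = [">>3", ">>5", ">>6", ">>7", ">>8"]
    return formula.replace(random.choice(shifts), random.choice(shifts), 1)

def score(formula: str, n: int = 4096) -> float:
    """Stand-in for the aesthetic model: reward varied, non-silent output."""
    samples = [eval(formula, {"t": t}) & 255 for t in range(n)]
    return len(set(samples)) / 256  # crude diversity metric in [0, 1]

def compete(agents: int = 4, rounds: int = 10,
            seed_formula: str = "t*(t>>5|t>>8)"):
    """Each agent evolves its formula privately; only the winner is revealed."""
    best = {a: (seed_formula, score(seed_formula)) for a in range(agents)}
    for _ in range(rounds):
        for a in range(agents):
            candidate = mutate(best[a][0])
            s = score(candidate)
            if s > best[a][1]:
                best[a] = (candidate, s)  # agent keeps its song secret
    winner = max(best, key=lambda a: best[a][1])
    return winner, best[winner][1]
```

In the real system the private step (3) is where the winner generates the zkML proof over its secret song, so the song itself never has to be published to claim the prize.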

Challenges I ran into

I had ambitious hopes of putting a pipeline of over 1B parameters of neural networks into a zkML prover. Turns out I had to scale WAY back!!!

I was in both ezkl and zkVM dependency hell, simultaneously reading docs & questioning LLMs to solve my issues. ChatGPT first suggested I trace my models into TorchScript, but then I'd have to load libtorch in Rust for the zkVM, and that is IMPRACTICAL. ONNX seemed the way. The ezkl docs were nice, although way too sparse, and used syntax from an older version than the one I was running, so I had to figure it out painfully.

Then I realized I still needed to scale back. I couldn't do a full music-generating DiT, but I could use very small bytebeat/demoscene-style music-generation code instead. The goal is to run, at the end of the pipeline, a model that does aesthetic scoring (returns a number for how "good" the music is): Meta's Audiobox Aesthetics. But it's 100M params. STILL TOO BIG. Also, ONNX is such a pain in the ass; Microsoft doesn't support all the opcodes in the runtime the way it should.

So I scaled back more: 10-second audio inputs, and I distilled the model down to a 100K-param transformer. I started getting paranoid about time, so I distilled it further to 20K params. ezkl setup was STILL getting killed. AHHH. I kept downsizing: 12 logrows, then 5; 2 was too far. OK, eventually I got a prover.
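The distillation step, reduced to its bare essence, is just training the small model on the big model's outputs. A toy sketch, assuming nothing about the real Audiobox pipeline -- the teacher function, the linear student, and the SGD loop are all illustrative:

```python
import math
import random

def teacher(x: float) -> float:
    """Stand-in for the big scorer (the real Audiobox model is ~100M params)."""
    return math.tanh(2.0 * x - 1.0)

def distill(n_steps: int = 2000, lr: float = 0.05, seed: int = 0):
    """Fit a tiny student y = w*x + b to the teacher's outputs:
    knowledge distillation stripped down to training on soft targets."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    for _ in range(n_steps):
        x = rng.random()          # random input "feature"
        y_teacher = teacher(x)    # soft target from the big model
        y_student = w * x + b     # tiny student's prediction
        err = y_student - y_teacher
        w -= lr * err * x         # SGD step on squared error
        b -= lr * err
    return w, b
```

The real version shrinks a 100M-param scorer to a 20K-param transformer the same way in spirit: the small model never sees ground-truth labels, only the big model's scores.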

Tracks Applied (2)

ZK Hack Berlin Winners

I used ezkl to do zkML

Best ZK app with No Product-Market Fit

Potentially dangerous to life itself

Arbitrum
