InfoFi

Price discovery of information, such as ML datasets. An intersection of DeFi, AI, and Information / Data

Created on 1st March 2025

The problem InfoFi solves

Data Vendors:

Data vendors upload their data to a private database and earn payments from customers who use their dataset.

Customers:

Customers who need data to train their AI models, or who need a trained model, can use our service to train a model on vendor datasets: the kind of data ML engineers are always eager to get their hands on.
Customers can provide a testing dataset and a desired accuracy threshold. The trained model is run against that testing dataset so they can see whether it suits their use case BEFORE paying for it.
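
The check described above can be sketched in a few lines of Python. This is only an illustration, not our backend code: the function names `accuracy` and `meets_threshold` are hypothetical, and we assume labels are plain integers.

```python
from typing import Sequence


def accuracy(predictions: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of test examples whose predicted label matches the true label."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("test set must not be empty")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def meets_threshold(predictions: Sequence[int], labels: Sequence[int],
                    threshold: float) -> bool:
    """True when the model reaches the customer's desired accuracy."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    return accuracy(predictions, labels) >= threshold


# Hypothetical test set: 3 of 4 predictions are correct.
preds = [1, 0, 1, 1]
truth = [1, 0, 0, 1]
print(meets_threshold(preds, truth, 0.7))  # passes a 70% threshold
print(meets_threshold(preds, truth, 0.8))  # fails an 80% threshold
```

Only a run that passes this kind of gate would move on to payment.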

Truth and Privacy:

We provide ZK Proofs that the accuracy was really met and that we truly did run inference on the testing dataset.
We then verify these proofs on ZKVerify, and submit the proofs to Hedera Consensus Service.
This is effectively a receipt for customers to show that they really did train their model on a dataset from us. A certificate of authenticity!

Price Discovery:

Like many P2P networks, we incentivize customers to give a new peer in the network a chance by starting training costs at an EGREGIOUSLY LOW price. Afterwards, as a dataset meets customers' accuracy thresholds, its price increases according to the formula of the bonding curve in our factory smart contract. This leads to:

  1. Organic price discovery
  2. New peers in the network are not choked out
  3. Protection against Sybil attacks, since useless purchases of a dataset can get quite expensive
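
To make the pricing idea concrete, here is a minimal Python sketch of one possible curve. It is an assumption, not the formula in our factory contract: we pick a linear curve, and the names `LinearBondingCurve`, `base_price`, and `slope` are hypothetical. Prices are integers in the smallest token unit to avoid rounding issues.

```python
from dataclasses import dataclass


@dataclass
class LinearBondingCurve:
    """Hypothetical linear curve: price = base_price + slope * successes."""
    base_price: int      # assumed very low starting price for a new dataset
    slope: int           # assumed price increase per successful training run
    successes: int = 0   # number of runs that met the accuracy threshold

    def current_price(self) -> int:
        return self.base_price + self.slope * self.successes

    def record_result(self, met_threshold: bool) -> int:
        """Return the amount charged for this run and update the curve.

        A run that misses the threshold is not charged and does not move
        the price, matching the idea that customers only pay for models
        that meet their target.
        """
        if not met_threshold:
            return 0
        charged = self.current_price()
        self.successes += 1
        return charged


curve = LinearBondingCurve(base_price=1, slope=2)
curve.record_result(True)     # first success is charged the base price
curve.record_result(False)    # failed run: no charge, price unchanged
print(curve.current_price())  # price after one success
```

A steeper or non-linear curve would make repeated purchases of the same dataset expensive faster, which is the lever for discouraging Sybil purchases.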

User Interaction and Data Flow

Data Vendors

Data vendors upload their dataset to the chain of their choice.
They are the admin of the dataset's bonding curve, and can withdraw their earnings whenever they wish.

Customers

Customers specify how they would like their model to be trained. Currently we support Decision Tree classifiers and the following parameters:

  • Max Depth
  • Min Samples Leaf
  • Min Samples Split

Customers also specify the accuracy they are willing to pay for on their uploaded test set.
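
A training request with these fields could be modelled as below. This is a sketch only: the class name `TrainingRequest` is hypothetical, the field names follow the common scikit-learn spelling of these decision tree parameters (an assumption about naming, not a statement about our backend), and the validation bounds are the usual minimums for such parameters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingRequest:
    """Hypothetical shape of a customer's decision tree training request."""
    max_depth: int             # maximum depth of the tree
    min_samples_leaf: int      # minimum samples required at a leaf node
    min_samples_split: int     # minimum samples required to split a node
    accuracy_threshold: float  # accuracy the customer is willing to pay for

    def __post_init__(self) -> None:
        # Reject requests that could never produce a valid tree.
        if self.max_depth < 1:
            raise ValueError("max_depth must be at least 1")
        if self.min_samples_leaf < 1:
            raise ValueError("min_samples_leaf must be at least 1")
        if self.min_samples_split < 2:
            raise ValueError("min_samples_split must be at least 2")
        if not 0.0 <= self.accuracy_threshold <= 1.0:
            raise ValueError("accuracy_threshold must be between 0 and 1")


request = TrainingRequest(
    max_depth=5,
    min_samples_leaf=1,
    min_samples_split=2,
    accuracy_threshold=0.9,
)
print(request)
```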

They then receive the encoded model, a ZK Proof of the inference, the attestation of the verified proof on ZKVerify (as well as the information to verify this on chain), and the transaction hash corresponding to the attestation on the Hedera Consensus Service.

The project architecture and development process

We wanted to create something that encouraged price discovery and ML. Decentralized compute services already exist, and so do AI model services. After reading a paper by Anthropic, we originally wanted to implement influence functions and use them to decide how to split a vault, given to us by a customer, across the multiple datasets we used for training. However, we did not have the GPUs to do this, nor a fair way to determine what should be rewarded: a dataset changing a parameter more than another dataset does not mean it helped more towards the goal.

We settled for handling one dataset at a time, and therefore decided that we should give consumers a way to sort these datasets by usefulness. Due to Sybil concerns, we eventually settled on the idea of payment, and that naturally led to price discovery.

Considering the interests of the data vendors, we wanted to keep the datasets completely private, and instead offer a ZK Proof that the trained model really did meet or exceed some accuracy threshold decided by the user.

The key implementation pieces are RISC0's proof of ML inference and the verification of these proofs by ZKVerify.

Returning to the DeFi side, we chose a rather simple bonding curve, but know this can be fine-tuned as much as a protocol wants. Feel free to experiment!
One thing we did give much thought to was how high to set the initial fee. From our studies of BitTorrent (a P2P file sharing service), we learned that it is necessary to give a new peer in the network an advantage over existing nodes (to boost their chance of being "picked"), and therefore settled for a very low price.

Product Integrations

We used RPCs to facilitate the communication between our backend and smart contracts.
We used MetaMask as the wallet for the frontend to let the user easily access the datasets that were deployed to that chain.
We used ZKVerifyJS to verify our ZK proof on chain.
We used Hedera Consensus Service to post our proofs as immutable and timestamped logs.
We used RISC0 to create the ZK proof.

Key differentiators and uniqueness of the project

Our project's key features are price discovery, compute, and ZK proofs that keep our data vendors' data private.

  1. Most marketplaces for goods these days allow vendors to set prices, but we have price discovery
  2. Compute is a highly sought after commodity!
  3. Keeping vendors' data private is paramount to the longevity of a business-vendor relationship. Otherwise, the vendor will grow frustrated with customers leaking the data after a one-time purchase. With ZK Proofs, the customers can trust that they got what they paid for (a model that beats an accuracy of their choosing on a test dataset) and vendors can know their data is safe.

Trade-offs and shortcuts while building

Limitations of Technology

We are unable to produce a ZK proof of the training of the model, because ZK proofs are too costly in time and compute. We look forward to new developments in cryptography, both in their applications to ML and in their verifiability on chain.

ML Specific Trade Offs

We only support Decision Trees for now, but we plan to support more models in the future.

One other thing we wanted to do was to support the use of multiple datasets, and then use influence functions to determine how much we should pay each dataset (as reward for their contribution to model weights).

DeFi Specific Trade Offs

We could not spend much time configuring the bonding curve, so we settled for a simple one. We plan on tuning it to make it fairer to vendors and customers.

We also currently sponsor all transactions, which is an unsustainable business model, so we plan on having users pay us directly for services.

Tracks Applied (5)

ZK is the Endgame

Our project fits into the ZK is the Endgame Track for many reasons. We use a RISC0 proof to verify inference of an AI mo...

zkSync

Hedera AI and Agents Challenge

Our project InfoFi is a great fit for this Hedera track. InfoFi allows users to train AI models (currently Decision Tree...

Hedera

Hedera Explorers: Imagine, Create, Hello Future

InfoFi is an exploratory idea so naturally it fits in well here. It allows users to train AI models (currently Decision ...

Hedera

ZK Proofs with New Technology - Fast and Inexpensive

Our project fits into the ZK Proofs with New Technology for many reasons. We use a RISC0 proof to verify inference of an...

zkVerify Foundation

DEFI, NFTS + GAMING

Our project fits into the DeFi, NFTs + Gaming Track by creating a data marketplace: an intersection between finance and ...