DataVerse
Own your data | Be in control | Start Earning
Created on 13th August 2023
The problem DataVerse solves
With the rise of generative models and AI over the last few years, more large-scale models are being trained on copyrighted and licensed data without adhering to the applicable copyright or license terms. This leaves the owners of the data hung out to dry while their data is being used, and the companies involved face a myriad of legal battles.
This is where our platform, DataVerse, comes in. Our goal is to provide a space where individuals can upload their data and consent to it being used as part of a large dataset for model training, while getting compensated for it. At the same time, anyone training a model has access to a diverse repository of quality real-world data to choose from.
How does this work? As a contributor, an individual uploads their own data, say an image, along with some required metadata. Once uploaded, the data becomes part of the platform and is tokenized into an NFT that is held by the user. A researcher or model trainer can then search for the kind of data they want and download it as a dataset in exchange for a fee, paid in our native token to the providers. The moment the data is downloaded, its owners are credited a certain number of tokens, thereby compensating them.
This creates a fair space where everyone wins, and it promotes transparency between big companies and individuals with regard to how their data is used.
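The upload, search, download, and credit flow described above can be sketched as a small in-memory model. This is only an illustration, not the platform's implementation: the `Marketplace` and `DataItem` names, the flat `price_per_item` fee, and the one-owner-per-item split are all assumptions, and NFT minting and on-chain transfers are left out.

```python
from dataclasses import dataclass, field


@dataclass
class DataItem:
    item_id: int
    owner: str
    tags: frozenset


@dataclass
class Marketplace:
    # Assumed flat fee per downloaded item, in the native token.
    price_per_item: int = 10
    items: list = field(default_factory=list)
    balances: dict = field(default_factory=dict)

    def upload(self, owner, tags):
        """Register a contribution together with its metadata tags."""
        item = DataItem(item_id=len(self.items), owner=owner, tags=frozenset(tags))
        self.items.append(item)
        return item

    def search(self, tag):
        """Return every item whose metadata contains the given tag."""
        return [item for item in self.items if tag in item.tags]

    def download(self, buyer, selected):
        """Charge the buyer and credit each item's owner at download time."""
        cost = self.price_per_item * len(selected)
        if self.balances.get(buyer, 0) < cost:
            raise ValueError("insufficient token balance")
        self.balances[buyer] = self.balances.get(buyer, 0) - cost
        for item in selected:
            self.balances[item.owner] = (
                self.balances.get(item.owner, 0) + self.price_per_item
            )
        return [item.item_id for item in selected]
```

For example, a trainer holding 100 tokens who downloads two matching images would pay 20 tokens, and each of the two contributors would be credited 10.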
Challenges we ran into
Our platform uses a similarity detection algorithm built on a CNN implemented in PyTorch. Given the time constraints, it was difficult to deploy the service, so we had to run it locally and expose it. This led to a number of network issues caused by high latency and consistently poor network conditions. We also had a lot of trouble sending the necessary tensor data over an API, as it needed to be encoded in a binary format. This made it difficult to test the API with tools such as Postman.
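One common way to make binary tensor data easier to send and to test by hand is to wrap the raw bytes in base64 inside a JSON body. The sketch below is not the encoding the project used; it assumes float32 values, uses a plain list of floats to stand in for the tensor (with PyTorch this could come from something like `tensor.flatten().tolist()`), and the `encode_tensor` and `decode_tensor` helpers and the payload field names are hypothetical.

```python
import base64
import json
import struct


def encode_tensor(values, shape):
    """Pack a flat list of floats as little-endian float32 and wrap it in JSON."""
    expected = 1
    for dim in shape:
        expected *= dim
    if expected != len(values):
        raise ValueError("shape does not match number of values")
    raw = struct.pack(f"<{len(values)}f", *values)
    return json.dumps({
        "shape": list(shape),
        "dtype": "float32",
        # base64 keeps the binary payload valid inside a text-only JSON body.
        "data": base64.b64encode(raw).decode("ascii"),
    })


def decode_tensor(payload):
    """Reverse encode_tensor: return the flat values and the shape."""
    body = json.loads(payload)
    raw = base64.b64decode(body["data"])
    count = len(raw) // 4  # four bytes per float32
    values = list(struct.unpack(f"<{count}f", raw))
    return values, tuple(body["shape"])
```

The trade-off is size: base64 inflates the payload by roughly a third, which matters on a slow link, but the request body becomes plain text that can be pasted into a tool such as Postman.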
We also ran into many deployment issues with our backend and ultimately had to run it locally and expose it as well. This created the same latency issues and, with network conditions as poor as they were, stalled development for long stretches at times.
Tracks Applied (3)
Ethereum + Polygon Track
Polygon
Filecoin
Filecoin
Web 3.0 Track
Technologies used
