Data is the new oil. AI is now used almost everywhere, and building state-of-the-art models takes an enormous amount of data. Today only large tech companies can afford data at that scale. To democratize AI, we need to democratize data and give that power back to people. Data scientists face a second problem: a lack of variety. Well-known academic datasets are often disconnected from real-life problems. What if there were a community where anyone could request any dataset they need, and people from all over the world helped each other build it?
We at DataBash are building a crowdsourcing-based dataset collection platform. A user can request any dataset, and people can donate their photos to it. Contributors get rewarded for the data they donate. Data scientists can focus on building beautiful models without worrying about where the data comes from. We make sure that only the best-quality data reaches them by using deep learning to filter the submissions.
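To make the filtering idea concrete, here is a minimal sketch of how such a step could look. It is not our actual pipeline: the `Submission` structure, the `filter_submissions` function, the `0.8` threshold, and the `dummy_score` stand-in are all hypothetical names chosen for illustration. A trained deep learning model would take the place of the scoring function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Submission:
    """One donated photo for a dataset request (hypothetical structure)."""
    contributor: str
    image_path: str


def filter_submissions(
    submissions: List[Submission],
    score_fn: Callable[[Submission], float],
    threshold: float = 0.8,  # assumed cutoff, not a tuned value
) -> List[Submission]:
    """Keep submissions whose relevance score meets the threshold.

    score_fn stands in for a trained model. It is assumed to return a value
    in [0, 1], where higher means the photo more likely fits the request.
    """
    return [s for s in submissions if score_fn(s) >= threshold]


# Stand-in scorer for illustration only; a real model would inspect the image.
def dummy_score(submission: Submission) -> float:
    return 0.1 if "spam" in submission.image_path else 0.95


donated = [
    Submission("alice", "cats/cat_001.jpg"),
    Submission("bob", "cats/spam_ad.jpg"),
]
accepted = filter_submissions(donated, dummy_score)
print([s.contributor for s in accepted])  # prints ['alice']
```

Passing the scorer in as a function keeps the filtering logic independent of any one model, which makes it easy to try several different methods behind the same interface.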
The biggest hurdle was figuring out how to handle the dataset itself and how to let users contribute to it. Dividing the problem into smaller tasks eventually led us to a feasible solution. Handling different types of datasets was another issue. Integrating the ML models for spam filtering also took some time, as we tried several different methods.