Suppose you are a YouTube creator with millions of subscribers and views. Every video you upload attracts a large number of comments, and when you read through them one by one you find that most are positive. For an educational video, for example, typical positive comments are 'thanks sir', 'you are the greatest sir!', 'awesome video' and 'nice content'.
A flood of positive comments like these is of little practical use to a creator. What creators need are suggestions, which help them improve their channel content, and questions, which they can answer for their viewers.
Currently YouTube provides no filter for sorting comments into suggestions or questions.
To solve this problem we have built a machine learning model that classifies comments into four categories, 'suggestion', 'question', 'positive' and 'other', and we display the classified comments on a website.
In short, we ask you only for the YouTube video link. We then scrape all the comments from that video, classify them with our machine learning model, and output the results.
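The flow described above (video link in, comments grouped by category out) can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the scraping step is left out, and `placeholder_classify` is a crude keyword stand-in, not our trained model.

```python
from urllib.parse import urlparse, parse_qs

CATEGORIES = ("suggestion", "question", "positive", "other")


def extract_video_id(url):
    """Return the video ID from a youtube.com/watch or youtu.be link, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "www.youtube.com", "m.youtube.com"):
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None


def group_by_category(comments, classify):
    """Run `classify` on each comment and bucket the comments by predicted label."""
    grouped = {category: [] for category in CATEGORIES}
    for comment in comments:
        label = classify(comment)
        if label not in grouped:
            # Anything unexpected falls back to the catch-all category.
            label = "other"
        grouped[label].append(comment)
    return grouped


def placeholder_classify(comment):
    """Hypothetical stand-in for the trained model: crude keyword rules only."""
    text = comment.lower()
    if "?" in text:
        return "question"
    if "please" in text or "should" in text:
        return "suggestion"
    if any(word in text for word in ("thanks", "awesome", "nice", "great")):
        return "positive"
    return "other"


if __name__ == "__main__":
    comments = [
        "thanks sir",
        "Can you explain recursion again?",
        "Please make a video on graphs",
        "first",
    ]
    for category, items in group_by_category(comments, placeholder_classify).items():
        print(category, items)
```

In the real project the `classify` argument would be a call into the trained model, and the comment list would come from the scraper for the video ID.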
The first step of our project was to build a model that can classify comments into the categories 'suggestion', 'question', 'positive' and 'other'. Training such a model requires data in which comments are labeled accordingly, and we found no such dataset on the internet. To tackle this first problem we created our own small dataset of 2480 comments, each manually labeled with one of the four categories.
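With a hand-labeled dataset, it helps to check that every row uses one of the four labels and to see how the labels are distributed. Below is a minimal sketch, assuming the dataset is stored as a CSV with `comment` and `label` columns (the file layout is an assumption, and the sample rows are illustrative only, not taken from our dataset).

```python
import csv
import io
from collections import Counter

LABELS = {"suggestion", "question", "positive", "other"}


def load_labeled_comments(csv_file):
    """Read (comment, label) pairs, rejecting any label outside the four categories."""
    rows = []
    # Data rows start on line 2 because line 1 is the header.
    for line_number, row in enumerate(csv.DictReader(csv_file), start=2):
        label = row["label"].strip().lower()
        if label not in LABELS:
            raise ValueError(f"line {line_number}: unknown label {label!r}")
        rows.append((row["comment"].strip(), label))
    return rows


def label_counts(rows):
    """Count how many comments carry each label."""
    return Counter(label for _, label in rows)


# Illustrative rows only; these are not taken from the actual dataset.
SAMPLE_CSV = """comment,label
thanks sir,positive
awesome video,positive
Can you cover dynamic programming next?,question
Please add subtitles to your videos,suggestion
first,other
"""

rows = load_labeled_comments(io.StringIO(SAMPLE_CSV))
counts = label_counts(rows)
```

A heavily skewed count (for example, far more 'positive' rows than 'suggestion' rows) would be worth knowing about before training, since the motivating observation is that positive comments dominate.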
The second problem we faced was loading the model into the API, which took us some time.
The third problem came when we uploaded our project files to GitHub: the upload failed because the model file exceeded GitHub's maximum upload size. As a temporary solution, we run the project on a local machine only.
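One way to spot this kind of problem before uploading is to scan the project folder for files above a size limit. The sketch below is an assumption-laden illustration: the function name is hypothetical, and the default limit assumes the 100 MiB per-file limit that GitHub documents for pushed files, which may not be the exact limit we hit (uploads through the browser have a lower one).

```python
import os

# Assumption: 100 MiB, the size above which GitHub blocks a pushed file.
DEFAULT_LIMIT_BYTES = 100 * 1024 * 1024


def find_oversized_files(root, limit_bytes=DEFAULT_LIMIT_BYTES):
    """Return sorted (relative_path, size) pairs for files under root above limit_bytes."""
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [name for name in dirnames if name != ".git"]
        for filename in filenames:
            path = os.path.join(dirpath, filename)
            size = os.path.getsize(path)
            if size > limit_bytes:
                oversized.append((os.path.relpath(path, root), size))
    return sorted(oversized)
```

Calling `find_oversized_files(".")` from the project root would list any file, such as a saved model, that is too large under the assumed limit.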
Technologies used
Discussion