In recent times, we have seen a drastic increase in toxic online content: hate speech targeting communities, misleading tweets, and out-of-context or abusive comments, any of which can help ignite protests and riots. The CAA-NRC and Farmer's Bill controversies are prominent examples.
We have also observed a sharp increase in violent and pornographic images, which makes platforms inappropriate for their users.
As a solution, we built INTERCEPT AI, a deep learning product offered as a B2B model that acts as a layer between the user interface and the backend. It filters all kinds of inappropriate and toxic content and warns the user about the offending action.
Features that we offer:
Specific Word Restriction: Uses OpenAI GPT-3 embeddings to compare the contextual similarity of the input text with a list of blocked words.
Multilingual Abuse Detection: Built on the MuRIL BERT model; detects abusive content in several Indian languages, including Hindi, Marathi, Assamese, Kannada, and Malayalam, among others.
Violent and Adult Content Detection: A CNN trained with transfer learning from ResNet50.
Toxic English Hate Speech Detection: Trained using bidirectional LSTMs.
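To illustrate the idea behind Specific Word Restriction, here is a minimal sketch of comparing an input embedding against blocked-word embeddings with cosine similarity. It is not our production code: the function names, the 0.8 threshold, and the toy 3-dimensional vectors are all assumptions made for illustration, and in practice the vectors would come from the embeddings service.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def is_blocked(text_embedding, blocked_embeddings, threshold=0.8):
    """Return (blocked, closest_word, score) for the most similar blocked word.

    `threshold` is a hypothetical cut-off chosen for this example only.
    """
    best_word, best_score = None, -1.0
    for word, embedding in blocked_embeddings.items():
        score = cosine_similarity(text_embedding, embedding)
        if score > best_score:
            best_word, best_score = word, score
    return best_score >= threshold, best_word, best_score


# Toy vectors standing in for real embeddings (assumption, not real model output).
blocked = {
    "blocked_word_a": [1.0, 0.0, 0.0],
    "blocked_word_b": [0.0, 1.0, 0.0],
}

print(is_blocked([0.9, 0.1, 0.0], blocked))  # close to blocked_word_a
print(is_blocked([0.0, 0.0, 1.0], blocked))  # unrelated to both
```

Comparing embeddings rather than raw strings is what lets the check react to context and paraphrase instead of exact matches only; the threshold would need tuning on real data.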
Training the models on such a large amount of data was very challenging on our own machines, so we used Kaggle and Google Colab GPUs.
Deploying the MuRIL BERT model was also somewhat challenging.
Deploying the English toxic hate speech model raised the problem of which tokenizer to use at inference time, so we saved the tokenizer used for training and loaded that same tokenizer in deployment.
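The tokenizer issue can be sketched roughly as follows. This is a simplified stand-in, not our actual pipeline: the helper names (`fit_vocabulary`, `save_vocabulary`, `load_vocabulary`, `texts_to_sequences`) are hypothetical, the whitespace tokenization is an assumption, and the vocabulary is stored as JSON purely for illustration. The point is only that the word-to-id mapping built at training time must be the one used at inference time.

```python
import json
import os
import tempfile


def fit_vocabulary(texts):
    """Build a word -> integer id mapping; id 0 is reserved for unknown words."""
    word_index = {}
    for text in texts:
        for word in text.lower().split():
            if word not in word_index:
                word_index[word] = len(word_index) + 1
    return word_index


def texts_to_sequences(word_index, texts):
    """Convert texts to lists of ids using an existing vocabulary."""
    return [[word_index.get(w, 0) for w in t.lower().split()] for t in texts]


def save_vocabulary(word_index, path):
    with open(path, "w", encoding="utf-8") as f:
        json.dump(word_index, f)


def load_vocabulary(path):
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


# Training time: fit the vocabulary once and persist it next to the model.
train_vocab = fit_vocabulary(["you are great", "you are awful"])
vocab_path = os.path.join(tempfile.mkdtemp(), "tokenizer.json")
save_vocabulary(train_vocab, vocab_path)

# Deployment: reload the saved vocabulary instead of fitting a new one,
# so every word maps to the same id the model saw during training.
served_vocab = load_vocabulary(vocab_path)
print(texts_to_sequences(served_vocab, ["you are awful today"]))
```

Fitting a fresh tokenizer at deployment would assign different ids to the same words, and the model's learned embeddings would no longer line up with its inputs.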