Fake News Detection using Machine Learning

A machine learning project that detects fake news with the help of Natural Language Processing, with the aim of curbing its spread.

The problem Fake News Detection using Machine Learning solves

Fake news spreads like wildfire today, because people have practically unlimited access to information on the Internet and on social media platforms. When fake news circulates on a large scale, it misleads the public and can lead to serious consequences such as mob lynching. To address this and curb the spread of fake news, a machine learning model was built to detect fake news articles with the help of Natural Language Processing. Several models were trained on a large dataset containing both true and fake news articles, and a Support Vector Classifier gave the highest accuracy, 99.7%, at classifying a news article as 'Fake' or 'True'. To put the tool to practical use and check articles in real time, the model was deployed in a web application built with Flask, where a user can enter a news article and have it classified as 'Fake' or 'True'.
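The classification step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn, assumes TF-IDF features (the write-up does not say how the text was vectorized), and uses a handful of made-up headlines in place of the real dataset. The names `texts`, `labels`, `model`, and `classify_article` are hypothetical.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Hypothetical toy data standing in for the real dataset of news articles.
texts = [
    "Miracle pill cures every disease overnight, doctors stunned",
    "Celebrity secretly replaced by clone, insiders claim",
    "Aliens endorse candidate in shocking leaked video",
    "Parliament passes annual budget after lengthy debate",
    "Central bank keeps interest rates unchanged this quarter",
    "City council approves funding for new public library",
]
labels = ["Fake", "Fake", "Fake", "True", "True", "True"]

# Assumed feature extraction: TF-IDF over word tokens, followed by a
# support vector classifier with a linear kernel.
model = make_pipeline(TfidfVectorizer(stop_words="english"), SVC(kernel="linear"))
model.fit(texts, labels)


def classify_article(article: str) -> str:
    """Return 'Fake' or 'True' for a single article string."""
    return str(model.predict([article])[0])
```

In a web application such as the Flask one mentioned above, a request handler could pass the submitted article text to a function like `classify_article` and render the returned label.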

Challenges I ran into

The first difficulty was cleaning the dataset, which comprises over 60,000 news articles. Several approaches were tried, such as removing URLs, tags, and stop words from the articles, and the one giving the best result was adopted. Another hurdle was identifying the ML model best suited to the data and able to classify articles with high accuracy. To solve this, several models were trained and evaluated, namely Naive Bayes, Random Forest Classifier, Decision Tree Classifier, and Support Vector Classifier, and the one with the highest accuracy was chosen.
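The cleaning steps mentioned above could look something like the sketch below. It uses only the Python standard library and rests on several assumptions: "tags" is read as HTML tags, the stop-word list is a tiny illustrative sample rather than the list the project used, and the function name `clean_article` is hypothetical.

```python
import re

# Small illustrative stop-word list (an assumption); a real project would
# likely use a fuller list from an NLP library.
STOP_WORDS = {
    "a", "an", "the", "and", "or", "of", "to", "in", "on", "is", "are",
    "was", "were", "it", "this", "that", "for", "with", "as", "by", "at",
}

TAG_RE = re.compile(r"<[^>]+>")                # HTML-style tags
URL_RE = re.compile(r"https?://\S+|www\.\S+")  # http(s) and www links
NON_ALPHA_RE = re.compile(r"[^a-z\s]")         # digits, punctuation, symbols


def clean_article(text: str) -> str:
    """Lowercase the text and strip tags, URLs, non-letters, and stop words."""
    text = text.lower()
    # Tags are removed before URLs so that a link inside an attribute
    # (for example href="...") disappears together with its tag.
    text = TAG_RE.sub(" ", text)
    text = URL_RE.sub(" ", text)
    text = NON_ALPHA_RE.sub(" ", text)
    tokens = [tok for tok in text.split() if tok not in STOP_WORDS]
    return " ".join(tokens)
```

For example, `clean_article("<p>The Senate passed the bill</p> https://example.com/x")` returns `"senate passed bill"`.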
