Spam Detector Machine Learning Program

This machine learning program classifies a message or email as SPAM or HAM (legitimate). Many people lose time, money, and personal information to spam; this tool helps them avoid that.

Built at Hack the Hourglass

Created on 1st August 2020

The problem Spam Detector Machine Learning Program solves

It labels a message as SPAM or HAM, saving the user from falling into the traps set by spammers (i.e. spam messages/emails, phishing emails, fraud messages, etc.).
Many people fall for phishing emails and spam, and may lose personal information, end up with a cluttered inbox, and so on.
Spam messages may contain FAKE offers or prize vouchers made to look as if they come from a genuine company.
People may contact the company the scammers impersonated to verify the offer or voucher, only to learn that it was just spam.
Most spam messages are very difficult to recognize, so software that detects them is necessary, especially for high-profile individuals.
People thus waste time and money, risk their personal information, and become DEPENDENT (not AATM NIRBHAR) on others (company officials, foreign services, professionals) to verify messages for them.
Our machine learning program, combined with an application, tests each message using multiple models (such as a Naive Bayes classifier) and helps you detect fraudulent messages, spam, and the like.
This saves time and money, protects personal information, and makes us AATM NIRBHAR (self-reliant, not dependent on anyone).

Challenges we ran into

We had to create a project relevant to the AATM NIRBHAR theme, so we decided to build a model that helps each individual decide whether a message or email is SPAM or HAM. Since individuals no longer have to depend on someone else to check whether a message is genuine, they are AATM NIRBHAR.

We needed to download MULTIPLE DATASETS AND INTEGRATE THEM INTO ONE. SPAM and HAM messages are not easy to tell apart, so more data is required to train the models.
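As a rough illustration of the integration step, here is a minimal pandas sketch. The two small in-memory frames, their column names, and the `normalise` helper are all hypothetical stand-ins; the project's actual datasets and schemas are not described here.

```python
import pandas as pd

# Hypothetical stand-ins for two downloaded datasets with different schemas.
sms = pd.DataFrame({
    "label": ["ham", "spam"],
    "text": ["See you at 5", "WIN a free voucher now"],
})
emails = pd.DataFrame({
    "Category": ["SPAM", "ham"],
    "Message": ["WIN a free voucher now", "Meeting moved to Monday"],
})


def normalise(df, label_col, text_col):
    """Return a copy of df reduced to the shared columns 'label' and 'text'."""
    out = df[[label_col, text_col]].rename(
        columns={label_col: "label", text_col: "text"}
    ).copy()
    out["label"] = out["label"].str.strip().str.lower()  # 'SPAM' -> 'spam'
    out["text"] = out["text"].str.strip()
    return out


# Stack the normalised frames and drop messages that appear in both sources.
combined = (
    pd.concat(
        [normalise(sms, "label", "text"), normalise(emails, "Category", "Message")],
        ignore_index=True,
    )
    .drop_duplicates(subset="text")
    .reset_index(drop=True)
)

print(combined)
```

Mapping every source onto the same two columns before concatenating keeps the later training code independent of where each row came from.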

We made a STREAMLIT application that takes a message as input and shows the result from two models.
We trained both models on 80% of the dataset and tested them on the remaining 20%.

Naive Bayes Classifier (Accuracy 97.8%)
Support Vector Machines (Accuracy 93.2%)
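
The 80/20 train/test procedure for the two models can be sketched with scikit-learn roughly as follows. This is an illustration under assumptions, not the project's model.py: the toy messages, the `CountVectorizer` features, the choice of `MultinomialNB`, and the linear-kernel `SVC` are all assumed, and the accuracies it prints on toy data have no relation to the figures reported above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import SVC

# Toy data (hypothetical); the real project trains on a merged dataset.
texts = [
    "WIN a free voucher now", "Claim your cash prize today",
    "Free entry, text WIN to claim", "Urgent: verify your account to get a reward",
    "Congratulations, you won a free phone",
    "See you at 5", "Meeting moved to Monday",
    "Can you send me the report?", "Lunch tomorrow?", "Thanks for your help yesterday",
]
labels = ["spam"] * 5 + ["ham"] * 5

# 80% of the messages for training, 20% held out for testing.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, random_state=0, stratify=labels
)

# Each pipeline turns text into word counts, then fits a classifier (assumed setup).
models = {
    "Naive Bayes": make_pipeline(CountVectorizer(), MultinomialNB()),
    "SVM": make_pipeline(CountVectorizer(), SVC(kernel="linear")),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    print(f"{name}: accuracy on held-out messages = {scores[name]:.2f}")

# Classify a new message with both models, as the application does for user input.
message = ["You have won a free voucher"]
predictions = {name: model.predict(message)[0] for name, model in models.items()}
print(predictions)
```

Wrapping the vectorizer and classifier in one pipeline means the same object can be fitted once and then called directly on raw text typed into the application.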

We encountered some errors while building the application (app.py and model.py), and loading the dataset on the localhost server takes around 2 minutes because it is large.

The QUICK DEADLINE was pretty challenging. Our project was ready, but filling in the submission form on Devfolio and uploading the video to YouTube over a slow connection was a big problem.

Technologies used

scikit-learn

NumPy

Machine Learning

pandas

Python

GitHub

Streamlit.io
