KlearStack

Get it Assessed!

Built at Unscript 2k23

Winner

Created on 14th February 2023

The problem KlearStack solves

Our website provides an interface where a user can upload an image of a document, whether photographed by hand or scanned, and we report whether it is readable, that is, whether its content can be extracted. The model classifies each image into one of four classes: very bad, bad, good, and very good.

Challenges we ran into

We ran into several challenges while implementing the Optical Character Recognition (OCR) document classification model. One of the main ones was accurately recognizing characters in the image. We addressed it through feature extraction: breaking the image down into smaller features and then searching for the features needed for OCR. Another challenge was handling redirects in the React frontend.

Convolutional Neural Networks (CNNs) were not feasible because the dataset was too small, and standard data augmentation techniques also proved impractical because of limited memory. To work around this, we enlarged the dataset by searching for additional images individually and creating variants of the same images with added noise, grain, brightness, and darkness.

A further challenge was finding the best-fitting algorithm for the model. We experimented with traditional machine learning models, deep learning models, and ensemble modeling techniques. The process was time-consuming and required a lot of trial and error, but it led to a better-performing model. Ensemble modeling also improved accuracy by combining the predictions of multiple models.
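The manual augmentation mentioned above (noise, grain, brightness, and darkness applied to the same images) could look roughly like the following. This is a minimal sketch rather than the project's actual code: it assumes images are 8-bit NumPy arrays, and the function names, the brightness shift of 40, the noise sigma of 15, and the 2% grain amount are all illustrative choices that do not come from the original write-up.

```python
import numpy as np


def adjust_brightness(image, delta):
    """Shift pixel intensities by delta (positive brightens, negative darkens)."""
    # Work in a wider integer type so the addition cannot wrap around.
    shifted = image.astype(np.int16) + delta
    return np.clip(shifted, 0, 255).astype(np.uint8)


def add_gaussian_noise(image, sigma, rng):
    """Add zero-mean Gaussian noise with standard deviation sigma."""
    noise = rng.normal(0.0, sigma, size=image.shape)
    noisy = image.astype(np.float32) + noise
    return np.clip(noisy, 0, 255).astype(np.uint8)


def add_grain(image, amount, rng):
    """Salt-and-pepper grain: set a fraction `amount` of pixels to black or white."""
    out = image.copy()
    mask = rng.random(image.shape[:2])
    out[mask < amount / 2] = 0        # pepper
    out[mask > 1 - amount / 2] = 255  # salt
    return out


def augment(image, seed=0):
    """Return a dict of augmented variants of one image (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    return {
        "bright": adjust_brightness(image, 40),
        "dark": adjust_brightness(image, -40),
        "noise": add_gaussian_noise(image, 15.0, rng),
        "grain": add_grain(image, 0.02, rng),
    }
```

Each source image yields four extra training samples that keep the original label. Processing one image at a time like this keeps memory use low, which may matter when a full in-memory augmentation pipeline is not an option. With OpenCV, an image loaded via `cv2.imread` is already a `uint8` array and can be passed to `augment` directly.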

Tracks Applied (1)

AI/ML

This project utilizes machine learning algorithms to accurately recognize the characters in the image and classify them ...

Technologies used

Flask

TensorFlow

OpenCV

Keras CNN

reactjs

sklearn
