Implementation of OCR using OpenCV

Optical character recognition (OCR) is a technology that recognizes text within a digital image. It is commonly used to recognize text in scanned documents and images.

Built at Unblock 2022

Created on 25th November 2022


The problem Implementation of OCR using OpenCV solves

Word processing software cannot process text in photos the same way it processes text documents. Optical character recognition (OCR) addresses this by converting images of printed, typed, or handwritten text into machine-encoded text that other business tools can evaluate. Paper-based documents in many languages and formats can be digitised into machine-readable text this way. Besides simplifying storage, this makes previously inaccessible data accessible to anybody. Just picture all the paper-filled archive boxes stored in a government or city basement. Such documents can be captured as a scene photo, a document photo, or both.

In this project, the document is submitted to our website and scanned, and OCR (the Pytesseract package) is used to extract the text. The extracted data is then cross-verified using the logger ID. If the information from the OCR extraction matches the information associated with the logger ID, the document is accepted. If not, it is rejected and sent to the appropriate authority.
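The extract-then-verify flow described above might look roughly like the sketch below. It is an illustration under assumptions, not the team's actual code. The function names (extract_text, normalize, verify, decide), the record layout keyed by logger ID, and the rule "every stored field value must appear in the OCR text" are all hypothetical. The source only says that OpenCV and Pytesseract are used and that the OCR output is compared with the logger ID information. Otsu thresholding is one common preprocessing choice and is also an assumption here.

```python
import re
from typing import Mapping, Optional


def extract_text(image_path: str) -> str:
    """Run OCR on a scanned document image and return the raw text."""
    # Imported lazily so the verification helpers below can be used
    # without the OCR stack installed.
    import cv2
    import pytesseract

    image = cv2.imread(image_path)
    if image is None:
        raise FileNotFoundError(f"could not read image: {image_path}")
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Otsu thresholding (assumed preprocessing step) binarizes the page
    # so dark text stands out from a lighter background.
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    return pytesseract.image_to_string(binary)


def normalize(value: str) -> str:
    """Lowercase and collapse whitespace so spacing differences are ignored."""
    return re.sub(r"\s+", " ", value).strip().lower()


def verify(ocr_text: str, record: Mapping[str, str]) -> bool:
    """Return True if every stored field value appears in the OCR text."""
    haystack = normalize(ocr_text)
    return all(normalize(v) in haystack for v in record.values())


def decide(ocr_text: str, logger_id: str,
           records: Mapping[str, Mapping[str, str]]) -> str:
    """Accept the document only if it matches the record for this logger ID."""
    record: Optional[Mapping[str, str]] = records.get(logger_id)
    # Unknown or empty records are rejected (hypothetical policy).
    if not record or not verify(ocr_text, record):
        return "rejected"
    return "accepted"
```

In use, the website handler would call extract_text on the uploaded scan and pass the result to decide together with the submitter's logger ID. A "rejected" result would then be forwarded to the appropriate authority. Exact substring matching is strict, so a real deployment would probably need some tolerance for OCR misreads.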

Challenges we ran into

Initially the team scanned documents manually and then worked towards automatic scanning. We ran into issues with mounting the camera and placing the document. We were also asked to design an automatic page flipper and to create a frontend for the OCR. Positioning the IR sensors and the lighting was a further challenge. The real-world problem we are addressing is saving files and accessing them. The government must store a large number of files, which cannot be accessed quickly when needed. Files are being lost due to a lack of storage, the stored files are not secure, and the documents are not readily accessible.

Technologies used

OpenCV

Pytesseract

Raspberry Pi 4
