Problems: Everyone has an equal right to experience the world and its images, but blind people have few ways to perceive what an image shows. Learning from images is also rarely practised with small children. To overcome this problem, we built an image captioning solution.
Solution: It can be used by blind or disabled people and by small children to learn and understand what is present in an image and what can be learned from it.
In addition, the project can be helpful for recommendation systems, where it helps recommend new images based on data collected from previous images.
First, training the model was very challenging for us: on our local CPU it would have taken more than 3 hours, so we trained it in Kaggle notebooks using their GPU. Next, converting the generated text into Nepali audio was quite hard, which we solved using the version-controlled Google translator. Similarly, when deploying to Heroku, static files weren't detected; this was solved using separate Django Heroku libraries.
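The caption-to-audio step described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the project's actual code: the function name `caption_to_nepali_audio` and the two stand-in backends are hypothetical, and the real translator and text-to-speech calls would be plugged in where the stand-ins are. The only fixed detail is `"ne"`, the ISO 639-1 language code for Nepali.

```python
from typing import Callable


def caption_to_nepali_audio(
    caption: str,
    translate: Callable[[str, str], str],
    synthesize: Callable[[str, str], bytes],
    target_lang: str = "ne",  # ISO 639-1 code for Nepali
) -> bytes:
    """Translate an English caption, then turn the translation into audio bytes.

    `translate` and `synthesize` are injected so the translator or
    text-to-speech backend can be swapped without touching this function.
    """
    caption = caption.strip()
    if not caption:
        raise ValueError("caption is empty")
    translated = translate(caption, target_lang)
    return synthesize(translated, target_lang)


# Hypothetical stand-in backends, for demonstration only.
def fake_translate(text: str, lang: str) -> str:
    # A real backend would return the Nepali translation of `text`.
    return f"[{lang}] {text}"


def fake_synthesize(text: str, lang: str) -> bytes:
    # A real backend would return encoded audio (for example MP3 bytes).
    return text.encode("utf-8")


audio = caption_to_nepali_audio(
    "a dog runs on the grass", fake_translate, fake_synthesize
)
```

Keeping the translator behind a plain callable is one possible way to isolate the part of the pipeline that gave us trouble, since a different translator version or service could then be swapped in without changing the captioning code.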
Discussion