Our group's idea was to apply summarization methods to an office environment. The system summarizes meetings by converting speech to text; the resulting text file is then sent to the summarizer to generate a summary. We also built an optical character recognition (OCR) system that extracts text from PDF files and converts it to a .txt file, which is likewise fed to the summarizer.
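As a rough illustration of the last stage of this pipeline (a .txt file going into a summarizer), here is a minimal sketch in Python. The write-up does not say which summarization model was used, so this example swaps in a simple word-frequency extractive summarizer as a stand-in. The names `summarize`, `summarize_file`, `split_sentences`, and `STOPWORDS` are hypothetical and chosen only for this example.

```python
import re
from collections import Counter
from pathlib import Path

# Small illustrative stopword list (an assumption, not from the project).
STOPWORDS = {
    "the", "a", "an", "and", "of", "to", "in", "is", "it", "that", "this",
    "we", "on", "with", "as", "be", "are", "was", "for", "by",
}


def split_sentences(text):
    """Split text into sentences on ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=2):
    """Stand-in extractive summarizer: keep the highest-scoring sentences.

    A sentence's score is the average document-wide frequency of its
    content words. Selected sentences are returned in original order.
    """
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in keep)


def summarize_file(txt_path, max_sentences=2):
    """Read a .txt file (e.g. a transcript or OCR output) and summarize it."""
    text = Path(txt_path).read_text(encoding="utf-8")
    return summarize(text, max_sentences)
```

Because both the speech-to-text step and the OCR step end in a plain .txt file, a single entry point such as `summarize_file` could serve both inputs; a real model would replace the body of `summarize` without changing that interface.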
While running Tesseract and related libraries, we ran into dependency issues. The setup did not work on Windows, and we eventually overcame the problem by switching to Linux. We also had some difficulty integrating the ML model with the HTML-rendered web page.
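One way to surface this kind of dependency problem early is to check that the OCR binary is reachable before any OCR work starts. The sketch below is not the project's code; it assumes the Tesseract command-line tool is installed under its usual name `tesseract`, and the helper names `find_tesseract` and `require_tesseract` are hypothetical.

```python
import shutil


def find_tesseract(binary_name="tesseract"):
    """Return the full path to the OCR binary, or None if it is not on PATH."""
    return shutil.which(binary_name)


def require_tesseract(binary_name="tesseract"):
    """Return the binary's path, or raise a clear error if it is missing."""
    path = find_tesseract(binary_name)
    if path is None:
        raise RuntimeError(
            f"'{binary_name}' was not found on PATH; install Tesseract "
            "or add its install directory to PATH before running OCR."
        )
    return path
```

Calling `require_tesseract()` at startup turns a confusing failure deep inside an OCR library into a single readable message, which is helpful when the same code has to run on both Windows and Linux.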
Technologies used
Discussion