Extractive Text Summary Generation and NER

This project automates the production of condensed versions of documents. Such automation is increasingly necessary in an information-driven society.


The problem Extractive Text Summary Generation and NER solves

The project generates extractive summaries of text automatically using Python, GloVe word embeddings, NLP, and statistical modelling, producing a summary that is 20% of the size of the original document. It also trains an NLP pipeline to recognize entities such as dates and times, currencies, organizations, persons, and locations in resumes, research documents, articles, blogs, webpages, etc.
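
The extractive idea can be illustrated with a minimal sketch. It is not the project's actual code: to stay self-contained it scores sentences by plain word frequency rather than GloVe embeddings, and it assumes the 20% target is measured in sentences. The names `summarize`, `split_sentences`, and `STOPWORDS` are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; a real pipeline would use a fuller one).
STOPWORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "on", "the", "to"}


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s.strip()]


def content_words(text):
    """Lowercased alphabetic tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def summarize(text, ratio=0.2):
    """Return roughly `ratio` of the sentences, kept in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average document frequency of the sentence's content words.
        return sum(freq[w] for w in tokens) / len(tokens) if tokens else 0.0

    keep = max(1, round(len(sentences) * ratio))
    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:keep])  # restore document order
    return [sentences[i] for i in chosen]
```

Because the summary is built only from sentences that already appear in the document, the output stays extractive. Swapping the frequency score for a similarity score over sentence vectors (for example, averaged GloVe vectors) would change only the `score` function.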

Challenges I ran into

spaCy provides a default Named Entity Recognition pipeline, but it was not very effective for this project. To address this, I developed and trained a custom named entity recognition model on annotated text sentences and job resume samples.
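
A rough sketch of how such a custom model could be trained is shown below. It assumes the spaCy v3 training API (`spacy.blank`, `Example.from_dict`, `nlp.initialize`, `nlp.update`); the write-up does not say which spaCy version or training procedure was actually used. The helper names `annotate` and `train_ner`, the sample sentence, and the labels are hypothetical.

```python
import random


def annotate(text, phrase_labels):
    """Build one training item in spaCy's character-offset format.

    `phrase_labels` maps a phrase to its entity label; only the first
    occurrence of each phrase is annotated (a simplification).
    """
    entities = []
    for phrase, label in phrase_labels.items():
        start = text.find(phrase)
        if start == -1:
            raise ValueError(f"phrase not found in text: {phrase!r}")
        entities.append((start, start + len(phrase), label))
    return text, {"entities": sorted(entities)}


def train_ner(train_data, n_iter=20, seed=0):
    """Train a blank English NER pipeline on (text, annotations) pairs."""
    # Imported here so the annotation helper works without spaCy installed.
    import spacy
    from spacy.training import Example

    random.seed(seed)
    nlp = spacy.blank("en")
    ner = nlp.add_pipe("ner")
    for _, annotations in train_data:
        for _, _, label in annotations["entities"]:
            ner.add_label(label)

    examples = [Example.from_dict(nlp.make_doc(text), ann) for text, ann in train_data]
    optimizer = nlp.initialize(lambda: examples)
    for _ in range(n_iter):
        random.shuffle(examples)
        losses = {}
        nlp.update(examples, sgd=optimizer, losses=losses)
    return nlp


# Hypothetical sample data, for illustration only.
sample = annotate(
    "Jane Doe worked at Acme Corp in Berlin.",
    {"Jane Doe": "PERSON", "Acme Corp": "ORG", "Berlin": "LOC"},
)
```

Calling `train_ner([sample, ...])` with enough annotated sentences and resume excerpts returns a pipeline whose `nlp(text).ents` can be compared against the default model's output on held-out documents.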