Sardar Patel Institute of Technology (SPIT), Mumbai
B.Tech in Computer Science with AI-ML Specialization | CGPA: 9.34 | Nov 2022 - Present
SP Jain Institute of Management & Research (SPJIMR)
Minor in Management | Feb 2024 - Present
Research Assistant | SPIT, Mumbai
Project: Vehicle and Crack Detection System
Built a Vehicle Detection model using Vision Transformer (ViT), achieving 92.81% validation accuracy on 16,185 images.
Developed a Crack Detection model with YOLOv9 and ViT, reducing false positives to improve structural safety analysis.
Gen-AI Intern | SPIT, Mumbai
Project: Fine-Tuning LLaMA-2 with LoRA
Fine-tuned LLaMA-2 on a Java dataset, enhancing its ability to generate adaptive, topic-specific questions.
Integrated MongoDB to log responses, supporting detailed feedback and user progress tracking.
Used LLaMA-3-8B to validate answers for relevance and to adjust question difficulty adaptively.
Advanced Image Segmentation Models
EffUNet for Road and Building Detection: Achieved mIoU of 0.8365 for buildings and 0.9153 for roads in aerial imagery.
UNet for Biomedical Segmentation: Developed for cell nuclei detection, optimized for reliable medical diagnostics.
AI and Vision-Based Models
Image-Based OCR Query System (Qwen2-VL): Local OCR system for image analysis that runs on a machine with 16 GB RAM.
Gesture Recognition System: Built with MediaPipe and an MLP classifier; deployed as a lightweight TFLite model.
NLP and Conversational Agents
SQL Query Assistant: Converts natural language to SQL using LangChain and RAG, integrated with PostgreSQL, MongoDB, and Google Gemini.
Chat with PDF: Enables contextual summarization and extraction across PDFs using RAG and LangChain.
Emotion Recognition on Tweets: RNN-based model with LSTM layers that classifies tweets into emotion categories.
Lyrics Recommender System: Content-based song recommendations using TF-IDF and cosine similarity.
1st Place at IIIT Nagpur Genathon 2.0 (National Level): Awarded for innovative solutions utilizing advanced AI technologies.
Amazon ML Challenge Hackathon: Ranked 201 of 75,000 participants, applying PaddleOCR, Qwen2-VL, and RAG to image analysis and data processing.
M# Manipal Hackathon (Top 28): Recognized for developing optimized, real-world problem-solving strategies.
Smart India Hackathon (Top 25): Developed "Rail Madad," a smart real-time rail track monitoring system using YOLO and BERT.
VCET Hackathon (Rank 6): Created "Student-Mentee Connect," a user engagement platform using text-to-SQL, RAG, and Django.
"Vehicle Detection Using Vision Transformer"
Submitted to IEEE Conference at IIIT Allahabad under Prof. Vaishnavee Rathod's guidance, achieving 92.81% validation accuracy.
Languages: Python (NumPy, Pandas, SpaCy, Keras, NLTK, OpenCV), C, C++, HTML, CSS
Frameworks & Databases: TensorFlow, Scikit-Learn, Django, MySQL, MongoDB, PostgreSQL
Areas of Interest: Machine Learning, Deep Learning, NLP, Computer Vision, Generative AI
Machine Learning A-Z | Udemy
Artificial Intelligence A-Z | Udemy
Email: [email protected]
LinkedIn: Vinayak Bhatia
GitHub: vvinayakkk