Mitudru Dutta
@MitudruDutta
Data Scientist | ML & Predictive Analytics Developer | Python | Building intelligent solutions
Kolkata, India
🚀 AI Engineer | Machine Learning Engineer | Data Scientist
🎯 Mission
Building production-grade AI/ML solutions that solve real-world problems. Specializing in Generative AI, Large Language Models, Predictive Analytics, and Autonomous AI Agents.
💼 Core Expertise
🤖 AI & Machine Learning
- Generative AI & LLMs: LangChain, LangGraph, Prompt Engineering, RAG Systems, Fine-tuning (LoRA, QLoRA)
- Agentic AI: Multi-agent systems, Tool calling, Reasoning chains, Model Context Protocol (MCP)
- Deep Learning: TensorFlow, PyTorch, Keras, Custom architectures
- Classical ML: Scikit-learn, XGBoost, Random Forests, Ensemble methods
📊 Data Science & Analytics
- Feature Engineering: Statistical analysis, Dimensionality reduction, Feature selection
- Model Optimization: Hyperparameter tuning, Cross-validation, SHAP analysis
- Time Series: ARIMA, Prophet, LSTM for forecasting
- Data Pipeline: ETL, Data warehousing, Big Data processing
🔧 Production & Deployment
- API Development: FastAPI, Flask, Django
- ML Operations: MLflow tracking, Model versioning, A/B testing
- Cloud & DevOps: Docker, GitHub Actions, CI/CD pipelines
- Databases: PostgreSQL, MySQL, Redis, Supabase, Vector DBs
🌟 Pinned Projects
📈 Credit Card Risk Modelling | ML Production System
- Built end-to-end ML pipeline for credit default prediction
- Tech: Scikit-learn, XGBoost, Feature Engineering, Streamlit deployment
- Impact: Predictive model with 89% AUC on unseen data
- Website
🏥 Healthcare Premium Prediction | Regression ML
- ML model for predicting insurance premium rates
- Tech: Pandas, NumPy, Scikit-learn, Statistical modeling
- Demonstrates feature engineering for healthcare domain
- Website
🤓 LearnLock | AI-Powered Learning Engine
- Adversarial learning system that uses an LLM as a knowledge validator
- Tech: LangChain, OpenAI API, Vector embeddings, FastAPI
- Validates understanding through contradiction detection & spaced repetition
- Try Live Demo | PyPI Package
🏘️ Real Estate Sentiment Tracker | NLP + Data Analytics
- Sentiment analysis on real estate market trends
- Tech: NLTK, Pandas, Web scraping, Visualization
- GitHub Repository
📄 Paper | ML Research Implementation
- Project implementing machine learning algorithms from research papers
- Tech: Python, Scikit-learn, NumPy, Data Processing Pipelines
- Focus: Implementing theoretical concepts from academic papers
- GitHub Repository
🔧 DamageX | Computer Vision & ML
- Damage assessment platform using computer vision and deep learning
- Tech: TypeScript, PyTorch, CNN, Image Processing
- Impact: 91% accuracy in damage detection
- Website
📚 Tech Stack
- 🐍 Languages: Python 3.9+, SQL, R (basic)
- 🤖 AI/LLMs: OpenAI, LangChain, LangGraph, Hugging Face, Groq
- 📊 ML Libraries: TensorFlow, PyTorch, Scikit-learn, XGBoost, Keras
- 📈 Data: Pandas, NumPy, SciPy, Matplotlib, Seaborn, Plotly
- 🗄️ Databases: PostgreSQL, MySQL, Redis, Supabase, Vector DBs (Pinecone, ChromaDB)
- ⚙️ Production: FastAPI, Flask, Django, Streamlit, Docker
- ☁️ DevOps: GitHub Actions, Git, MLflow, Hugging Face Hub
📈 Stats & Contributions
- 532+ GitHub contributions in 2024 | Learning-in-public approach
- 50+ ML/AI projects across data science, deep learning, and generative AI
- Open-source contributor with published packages
- Hackathon builder: developed multiple AI solutions in competitive environments