Mitudru Dutta

@MitudruDutta

Data Scientist | ML & Predictive Analytics Developer | Python | Building intelligent solutions

Python
Machine Learning
Deep Learning
Natural Language Processing

Kolkata, India

🚀 AI Engineer | Machine Learning Engineer | Data Scientist

🎯 Mission

Building production-grade AI/ML solutions that solve real-world problems. Specializing in Generative AI, Large Language Models, Predictive Analytics, and Autonomous AI Agents.


💼 Core Expertise

🤖 AI & Machine Learning

  • Generative AI & LLMs: LangChain, LangGraph, Prompt Engineering, RAG Systems, Fine-tuning (LoRA, QLoRA)
  • Agentic AI: Multi-agent systems, Tool calling, Reasoning chains, Model Context Protocol (MCP)
  • Deep Learning: TensorFlow, PyTorch, Keras, Custom architectures
  • Classical ML: Scikit-learn, XGBoost, Random Forests, Ensemble methods

📊 Data Science & Analytics

  • Feature Engineering: Statistical analysis, Dimensionality reduction, Feature selection
  • Model Optimization: Hyperparameter tuning, Cross-validation, SHAP analysis
  • Time Series: ARIMA, Prophet, LSTM for forecasting
  • Data Pipeline: ETL, Data warehousing, Big Data processing

🔧 Production & Deployment

  • API Development: FastAPI, Flask, Django
  • ML Operations: MLflow tracking, Model versioning, A/B testing
  • Cloud & DevOps: Docker, GitHub Actions, CI/CD pipelines
  • Databases: PostgreSQL, MySQL, Redis, Supabase, Vector DBs

🌟 Pinned Projects

📈 Credit Card Risk Modelling | ML Production System

  • Built end-to-end ML pipeline for credit default prediction
  • Tech: Scikit-learn, XGBoost, Feature Engineering, Streamlit deployment
  • Impact: Model reaches an AUC of 0.89 on unseen data
  • Website

🏥 Healthcare Premium Prediction | Regression ML

  • ML model for predicting insurance premium rates
  • Tech: Pandas, NumPy, Scikit-learn, Statistical modeling
  • Demonstrates feature engineering for healthcare domain
  • Website

🤓 LearnLock | AI-Powered Learning Engine

  • Adversarial learning system that uses an LLM as a knowledge validator
  • Tech: LangChain, OpenAI API, Vector embeddings, FastAPI
  • Validates understanding through contradiction detection & spaced repetition
  • Try Live Demo | PyPI Package

🏘️ Real Estate Sentiment Tracker | NLP + Data Analytics

  • Sentiment analysis on real estate market trends
  • Tech: NLTK, Pandas, Web scraping, Visualization
  • GitHub Repository

📄 Paper | ML Research Implementation

  • Implementations of machine learning algorithms from research papers
  • Tech: Python, Scikit-learn, NumPy, Data Processing Pipelines
  • Focus: Implementing theoretical concepts from academic papers
  • GitHub Repository

🔧 DamageX | Computer Vision & ML

  • Damage assessment platform using computer vision and deep learning
  • Tech: TypeScript, PyTorch, CNN, Image Processing
  • Impact: 91% accuracy in damage detection
  • Website

📚 Tech Stack

  • 🐍 Languages: Python 3.9+, SQL, R (basic)
  • 🤖 AI/LLMs: OpenAI, LangChain, LangGraph, Hugging Face, Groq
  • 📊 ML Libraries: TensorFlow, PyTorch, Scikit-learn, XGBoost, Keras
  • 📈 Data: Pandas, NumPy, SciPy, Matplotlib, Seaborn, Plotly
  • 🗄️ Databases: PostgreSQL, MySQL, Redis, Supabase, Vector DBs (Pinecone, ChromaDB)
  • ⚙️ Production: FastAPI, Flask, Django, Streamlit, Docker
  • ☁️ DevOps: GitHub Actions, Git, MLflow, Hugging Face Hub


📈 Stats & Contributions

  • 532+ GitHub contributions in 2024 | Learning-in-public approach
  • 50+ ML/AI projects across data science, deep learning, and generative AI
  • Open-source contributor with published packages
  • Hackathon builder: developed multiple AI solutions in competitive environments