I am most proud of my summer internship project, in which I worked on convolutional neural networks and embedded hardware. The task was to port a facial recognition neural network to Google's Vision Kit.
Why am I most proud of it?
I converted the FaceNet NN4 neural network to TensorFlow and trained it to 92% accuracy. Using inception modules reduced the model size to just 32 MB, compared with 120 MB for the pre-existing model. This allowed the neural network to run at 8 FPS, which is a strong result for low-cost embedded hardware.
What roadblocks did I face, and how did I overcome them?
The available paper on FaceNet does not fully describe the hyperparameters to use during training, so a lot of trial and error was needed. Each change in hyperparameters meant 3 days of training time. To speed up training, I implemented an "online triplet mining" loss function and batch normalization, working only from their descriptions in a paper.
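To illustrate what online triplet mining does, here is a minimal sketch in plain Python rather than TensorFlow. It assumes the "batch-hard" variant, where each anchor is paired with its farthest positive and closest negative within the current batch; the FaceNet paper itself describes selecting semi-hard negatives, and this is not necessarily the variant used in the project. The function names and the margin of 0.2 are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two embedding vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def batch_hard_triplet_loss(embeddings, labels, margin=0.2):
    """Mean triplet loss with triplets mined inside the batch (batch-hard).

    For every anchor, the hardest positive (farthest same-label sample) and
    the hardest negative (closest different-label sample) are selected from
    the batch itself, so no triplets have to be prepared offline.
    The margin value is an illustrative assumption.
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [
            squared_distance(anchor, other)
            for j, (other, other_label) in enumerate(zip(embeddings, labels))
            if other_label == label and j != i
        ]
        negatives = [
            squared_distance(anchor, other)
            for other, other_label in zip(embeddings, labels)
            if other_label != label
        ]
        # An anchor without a positive or a negative cannot form a triplet.
        if not positives or not negatives:
            continue
        # Hinge: the positive should be closer than the negative by `margin`.
        losses.append(max(max(positives) - min(negatives) + margin, 0.0))
    return sum(losses) / len(losses) if losses else 0.0
```

Because the triplets are chosen from whatever is in the batch at each step, the loss keeps focusing on the examples the network currently finds hardest, which is the reason this kind of mining tends to shorten training.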
Later, with help from my colleagues, I was able to run the training on multiple GPUs in parallel.
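The text does not say which parallelization scheme was used. A common one is synchronous data parallelism, and the sketch below assumes it: each worker (a stand-in for one GPU) computes the gradient on its own equal-sized shard of the batch, and the shard gradients are averaged before the weight update. The one-parameter model y = w * x and all function names are hypothetical, chosen only to keep the example small.

```python
def mse_gradient(w, batch):
    """Gradient of the mean squared error with respect to w for y = w * x."""
    return sum(2 * (w * x - y) * x for x, y in batch) / len(batch)


def data_parallel_gradient(w, batch, num_workers):
    """Average of per-shard gradients, as in synchronous data parallelism.

    Each shard plays the role of the data handled by one GPU. The shards
    are processed sequentially here; on real hardware they would run
    concurrently.
    """
    if len(batch) % num_workers != 0:
        raise ValueError("batch size must be divisible by num_workers")
    shard_size = len(batch) // num_workers
    shards = [
        batch[k * shard_size:(k + 1) * shard_size]
        for k in range(num_workers)
    ]
    shard_gradients = [mse_gradient(w, shard) for shard in shards]
    return sum(shard_gradients) / num_workers
```

With equal-sized shards, the averaged gradient matches the gradient computed on the whole batch, so the training result is the same while the work per step is split across devices.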