Varun Parthasarathy
@douxiaotian what kind of results did you get? I'm currently trying out a cyclical learning rate with SGD, but it'll take a while to finish training. I'm planning to try...
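For reference, the cyclical schedule I'm trying is basically Leslie Smith's triangular policy. A minimal sketch (the `base_lr`/`max_lr` bounds here are placeholders; in practice I'd pick them from a range test):

```python
def triangular_lr(step, step_size=2000, base_lr=1e-4, max_lr=1e-1):
    # One full cycle lasts 2 * step_size steps: the learning rate ramps
    # linearly from base_lr up to max_lr, then back down to base_lr.
    cycle = step // (2 * step_size)
    x = abs(step / step_size - 2 * cycle - 1)  # goes 1 -> 0 -> 1 within a cycle
    return base_lr + (max_lr - base_lr) * max(0.0, 1 - x)
```

Each training step then just sets the optimizer's learning rate to `triangular_lr(step)` before the update.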
Yeah, I think so too. I'm currently downloading the Deepglint dataset (Cleaned MS-Celeb + Asian Celeb; ~7 million images, ~180,000 identities) - my previous experiment with SGD failed miserably. I'll...
@xlphs your results seem promising! Just to clarify, what dataset did you finetune on? Also, have you tried training from scratch at any point?
@kifaw the idea of validation is to see how well the model generalizes on data it hasn't seen before, so if the overlap is still present, then there will be...
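One quick sanity check before training is to just intersect the identity labels of the two splits. A sketch, assuming `train_ids` and `val_ids` are iterables of identity labels (e.g. folder names in an MS-Celeb-style directory layout):

```python
def identity_overlap(train_ids, val_ids):
    # Any identity present in both splits inflates validation numbers,
    # since the model has effectively seen those faces during training.
    overlap = set(train_ids) & set(val_ids)
    if overlap:
        print(f"{len(overlap)} identities leak into validation, "
              f"e.g. {sorted(overlap)[:5]}")
    return overlap
```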
I ran a learning rate range test a while back; the results are interesting. Does this mean larger learning rates would perform well? Can someone clarify this? This...
@kifaw that's something I unfortunately don't understand myself. I'm running some more range tests right now using the FaceNet triplet selection method, but I find it strange that the learning...
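For anyone curious, my reading of the FaceNet-style semi-hard selection is roughly the sketch below (NumPy, not the repo's actual code; `embeddings` and `labels` are assumed to be arrays for one batch, and the paper selects negatives with d(a,p) < d(a,n) < d(a,p) + margin):

```python
import numpy as np

def semi_hard_triplets(embeddings, labels, margin=0.2):
    # Pairwise Euclidean distances between all embeddings in the batch.
    dists = np.linalg.norm(embeddings[:, None] - embeddings[None, :], axis=-1)
    triplets = []
    for a in range(len(labels)):
        positives = np.where(labels == labels[a])[0]
        negatives = np.where(labels != labels[a])[0]
        for p in positives:
            if p == a:
                continue
            d_ap = dists[a, p]
            # Semi-hard: farther than the positive, but within the margin.
            mask = (dists[a, negatives] > d_ap) & (dists[a, negatives] < d_ap + margin)
            candidates = negatives[mask]
            if len(candidates):
                n = candidates[np.argmin(dists[a, candidates])]  # hardest of them
                triplets.append((a, p, n))
    return triplets
```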
I guess there were some issues with the range test (I didn't run it for long enough). I ran it for about 20000 steps and got a more reasonable range...
@neklom the range test essentially involves slowly _increasing_ the learning rate over time, while tracking loss vs. learning rate. At a certain value of the learning rate, loss falls drastically...
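Concretely, the loop looks something like this (a sketch; `train_step` is a hypothetical function that runs one training batch at the given learning rate and returns the loss):

```python
import math

def lr_range_test(train_step, lr_min=1e-6, lr_max=1.0, num_steps=20000):
    # Sweep the learning rate geometrically from lr_min to lr_max,
    # recording (lr, loss) pairs. The usable range runs from where the
    # loss first starts dropping steeply to just before it blows up.
    growth = (lr_max / lr_min) ** (1.0 / num_steps)
    lr, best, history = lr_min, math.inf, []
    for _ in range(num_steps):
        loss = train_step(lr)
        history.append((lr, loss))
        best = min(best, loss)
        if loss > 4 * best:  # loss diverged; stop early
            break
        lr *= growth
    return history
```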
@xlphs From my experience, training seems to become unstable once accuracy crosses 0.9 - the validation rate starts fluctuating wildly between 0.2 and 0.5. I generally stop training at this...
Training from scratch with triplet loss gives an accuracy of about 92.5% (similar to OpenFace), while validation tends to vary between 35% and 40%, even after 800k iterations. I guess...