composer
Focal Loss, Taylor Cross Entropy Loss, SnapMix, Adaptive Gradient Clipping
New Method
Motivation
All of the above-mentioned methods either improve the final model's accuracy or reduce the training time needed to reach it. I don't know which of these you already plan to implement, but they are the ones I use frequently for training models that are not part of your library.
[Optional] Implementation
I just stumbled across this repo and I'm not 100% familiar with your implementation process. If loss functions just need to be added to https://github.com/mosaicml/composer/blob/dev/composer/loss/loss.py and algorithms to https://github.com/mosaicml/composer/tree/dev/composer/algorithms, should I just create PRs for the desired methods?
Thanks @alexriedel1 for suggesting these. Yes, for the loss functions it's just adding them to loss.py. We can take care of algorithm-izing them a bit later.
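To make the loss-function part concrete, here is a minimal, framework-free sketch of focal loss for a single example, using the usual form FL = -(1 - p_t)^gamma * log(p_t). The function name `focal_loss` and its parameters are hypothetical and are not taken from Composer's loss.py; a real contribution would presumably operate on batched tensors instead of Python lists.

```python
import math


def focal_loss(logits, target, gamma=2.0):
    """Focal loss for one example (hypothetical helper, not Composer's API).

    logits: list of raw class scores
    target: index of the true class
    gamma:  focusing parameter; gamma = 0 reduces to plain cross entropy
    """
    # Numerically stable log-softmax for the target class.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    log_pt = logits[target] - log_sum_exp
    pt = math.exp(log_pt)
    # Down-weight well-classified examples (pt close to 1).
    return -((1.0 - pt) ** gamma) * log_pt
```

With `gamma=0` the modulating factor is 1, so the result equals standard cross entropy, which makes for a handy sanity check when porting this to tensors.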
For SnapMix, see https://github.com/mosaicml/composer/blob/dev/composer/algorithms/cutout/cutout.py for an example implementation of a data augmentation technique and the pattern to follow; it should be doable.
For AGC, we have implemented basic gradient clipping here (https://github.com/mosaicml/composer/blob/dev/composer/trainer/trainer.py#L1215), but it could be enhanced with the AGC version!
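For reference, the core AGC rule can be sketched as follows: a gradient is rescaled whenever its norm exceeds a fixed fraction of the corresponding weight norm. This is only an illustration under simplifying assumptions. It treats one flat list of weights as a single unit (AGC is usually applied unit-wise, e.g. per output row of a weight matrix), and the names `agc_clip`, `clipping`, and `eps` are hypothetical, not Composer's trainer API.

```python
import math


def agc_clip(weights, grads, clipping=0.01, eps=1e-3):
    """Adaptive gradient clipping for one unit (hypothetical helper).

    Rescales grads so that ||grads|| <= clipping * max(||weights||, eps).
    Returns a new list; the inputs are left unchanged.
    """
    # eps keeps zero-initialized weights from forcing every gradient to zero.
    w_norm = max(math.sqrt(sum(w * w for w in weights)), eps)
    g_norm = math.sqrt(sum(g * g for g in grads))
    max_norm = clipping * w_norm
    if g_norm > max_norm:
        scale = max_norm / g_norm
        return [g * scale for g in grads]
    return list(grads)
```

Unlike clipping to a fixed global norm, the threshold here scales with the size of the weights, so large layers are not clipped more aggressively than small ones.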
@alexriedel1, just an update: we implemented adaptive gradient clipping in #924. Give it a try/review and let us know if it helps!
Closing. Tracking elsewhere as low pri. We're open to community suggestions!