wl-coref
Reduce training memory requirement
CUDA-enabled machine (48 GB to train, 4 GB to evaluate)
@vdobrovolskii friendly ping. Is 48 GB really needed to train? Couldn't we train for longer (how much longer?) with less memory? And couldn't the project leverage FP16, FP8, and other optimizations? You get those out of the box if you use RoBERTa from the Transformers library (https://github.com/huggingface/transformers), and there is also Accelerate (https://huggingface.co/docs/accelerate/index).
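For reference, here is a minimal sketch of what FP16 training with Transformers plus Accelerate could look like. The dummy batch and the stand-in loss are placeholders for illustration only; wl-coref does not currently use Accelerate, and its real encoder, data pipeline, and coreference loss would take their place.

```python
import torch
from accelerate import Accelerator
from transformers import AutoModel, AutoTokenizer

# fp16 mixed precision; "bf16" is an alternative on Ampere-class GPUs
accelerator = Accelerator(mixed_precision="fp16")

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")
optimizer = torch.optim.AdamW(encoder.parameters(), lr=2e-5)

# prepare() moves the model to the right device and enables autocast for its forward pass
encoder, optimizer = accelerator.prepare(encoder, optimizer)

# Dummy batch just to show the training step; real coreference data would go here.
batch = tokenizer(["A short example sentence."], return_tensors="pt").to(accelerator.device)

outputs = encoder(**batch)                      # forward pass runs in fp16 where safe
loss = outputs.last_hidden_state.pow(2).mean()  # stand-in for the real coreference loss
accelerator.backward(loss)                      # applies loss scaling for fp16
optimizer.step()
optimizer.zero_grad()
```

FP16 roughly halves the activation memory of the RoBERTa encoder, which is where most of the footprint comes from, so the 48 GB figure should come down noticeably even before other tricks like gradient accumulation or checkpointing.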
I have a 3070 with 8GB of GDDR6 :/
It should be totally possible to reduce the training requirements. I've been thinking for a long time about rewriting the project (it grew out of another project and carries a lot of legacy code) using PyTorch Lightning, which would give easy access to those optimizations, multi-GPU training, etc.; a rough sketch of what that could look like is below.
I just haven't had time to do that yet :(
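To make the Lightning idea concrete, here is a rough sketch of what such a wrapper might look like. The CorefModule class, its RoBERTa encoder, and the stand-in loss are hypothetical placeholders rather than wl-coref's actual architecture; the point is that mixed precision and multi-GPU training then come from Trainer flags instead of hand-written training-loop code.

```python
import torch
import pytorch_lightning as pl
from transformers import AutoModel

class CorefModule(pl.LightningModule):
    """Hypothetical wrapper; wl-coref's real encoder, scorers, and loss would go here."""

    def __init__(self):
        super().__init__()
        self.encoder = AutoModel.from_pretrained("roberta-base")

    def training_step(self, batch, batch_idx):
        outputs = self.encoder(**batch)
        loss = outputs.last_hidden_state.pow(2).mean()  # stand-in for the coreference loss
        self.log("train_loss", loss)
        return loss

    def configure_optimizers(self):
        return torch.optim.AdamW(self.parameters(), lr=2e-5)

# Mixed precision and device placement are Trainer flags (Lightning >= 2.0 syntax).
trainer = pl.Trainer(precision="16-mixed", accelerator="gpu", devices=1, max_epochs=1)
# trainer.fit(CorefModule(), train_dataloaders=...)  # dataloader omitted from this sketch
```

With this structure, scaling out is mostly a matter of raising `devices` and picking a strategy such as `"ddp"`, rather than rewriting the training loop.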
Happy to hear that :) No worries, you don't owe us anything, but it would be great if you find the time/energy/will. Please ping me if that happens someday!