NeSVoR
Efficiency with DeepSpeed
Possibly a naive suggestion: I wonder whether we could lower VRAM usage and/or improve speed using DeepSpeed (https://github.com/microsoft/DeepSpeed)?
Thanks for the suggestion. We already use some of those techniques, e.g., mixed-precision training, to reduce GPU memory usage and improve efficiency. Other techniques, such as offloading, are useful for large models but are probably not necessary in our case. There may be newer techniques that I am not aware of, so I will keep an eye on it.
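
For readers unfamiliar with mixed precision, here is a minimal sketch of why it saves memory and why it is usually paired with loss scaling. This is not NeSVoR's code; it uses only the Python standard library, and the tensor size, gradient value, and loss scale are hypothetical numbers chosen for illustration.

```python
import struct

def to_half(x: float) -> float:
    """Round a Python float to IEEE 754 half precision and back."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Storage cost per element in single vs. half precision.
BYTES_FP32 = struct.calcsize("<f")  # 4 bytes
BYTES_FP16 = struct.calcsize("<e")  # 2 bytes

def tensor_mebibytes(num_elements: int, bytes_per_element: int) -> float:
    """Memory needed to store a dense tensor, in MiB."""
    return num_elements * bytes_per_element / 2**20

# Hypothetical dense 256^3 tensor (an assumption, not a NeSVoR setting).
num_elements = 256 ** 3
fp32_mib = tensor_mebibytes(num_elements, BYTES_FP32)  # 64.0
fp16_mib = tensor_mebibytes(num_elements, BYTES_FP16)  # 32.0

# Very small gradients underflow to zero in half precision, which is
# why mixed-precision training typically multiplies the loss by a
# scale factor and divides the gradients by it afterwards.
tiny_grad = 1e-8      # hypothetical gradient value
loss_scale = 1024.0   # hypothetical scale factor

unscaled = to_half(tiny_grad)             # lost: rounds to 0.0
scaled = to_half(tiny_grad * loss_scale)  # survives as a small nonzero value
recovered = scaled / loss_scale           # unscaled again in full precision

print(f"fp32: {fp32_mib} MiB, fp16: {fp16_mib} MiB")
print(f"unscaled fp16 gradient: {unscaled}")
print(f"recovered gradient:     {recovered:.3e}")
```

In a real training loop the framework handles the casting and scaling, so the sketch above only shows the underlying arithmetic.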