SpecForge
[Feature] Support FP8 target model during training
Checklist
- [x] 1. If the issue you raised is not a feature but a question, please raise a discussion at https://github.com/sgl-project/SpecForge/discussions/new/choose. Otherwise, it will be closed.
- [x] 2. Please use English, otherwise it will be closed.
Motivation
Please support FP8 target models during training to reduce GPU memory usage.
Related resources
No response
@zyksir has already added support for this. I think the feature will be merged soon.
@yubofredwang Could you please point out which PR you are referring to, if there is one?
@yubofredwang I have the same issue. Could you please point it out? The current implementation doesn't support FP8 (e.g. fine-grained FP8), and simple type casting is not enough because the activation scale is computed online. In a distributed setting, each GPU computes a different scale because it holds a different chunk of the original data.
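
To make the point concrete, here is a minimal sketch (not the SpecForge implementation, and simplified to per-tensor scaling rather than fine-grained/block-wise FP8) of why naive casting breaks under dynamic scaling: each rank derives its scale from its own shard of the activation, so the scales diverge unless they are synchronized, e.g. with an all-reduce of the absolute maximum. The helper name `quantize_fp8_per_tensor` is illustrative.

```python
# Minimal sketch: dynamic per-tensor FP8 quantization of a sharded activation.
# Assumes PyTorch >= 2.1 (torch.float8_e4m3fn) and an initialized process group.
import torch
import torch.distributed as dist

FP8_MAX = torch.finfo(torch.float8_e4m3fn).max  # ~448 for e4m3


def quantize_fp8_per_tensor(x: torch.Tensor, sync_scale: bool = True):
    """Quantize a local activation shard to FP8 with an online per-tensor scale."""
    amax = x.abs().amax().float()
    if sync_scale and dist.is_initialized():
        # Without this all-reduce, every rank computes a different amax from its
        # local chunk, so the resulting scales (and dequantized values) diverge.
        dist.all_reduce(amax, op=dist.ReduceOp.MAX)
    scale = amax.clamp(min=1e-12) / FP8_MAX
    x_fp8 = (x / scale).clamp(-FP8_MAX, FP8_MAX).to(torch.float8_e4m3fn)
    return x_fp8, scale  # dequantize later as x_fp8.float() * scale
```

This is only meant to show why a simple dtype cast is insufficient; fine-grained FP8 additionally keeps one scale per block, which has the same synchronization concern whenever a block is split across ranks.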