FastChat
Deepspeed support and config file?
Hi:
Can this model be trained with Deepspeed support? If yes, could anyone provide a workable Deepspeed config file?
Thanks.
BTW, I have tried a simple config setting as below: `{ "zero_optimization": true }`. I then observed about a 20% increase in training speed. Honestly speaking, that's far short of what I expected (double or even triple the training speed), so I think my config file is incorrect.
The speedup depends on your GPU network topology and on the ZeRO stage / parallelization config you choose. The zero optimization config is typically an object specifying the stage:

```json
"zero_optimization": {
    "stage": $STAGE_NUMBER_YOU_WANT_FROM_0_TO_3
}
```
It seems the training speed with DeepSpeed isn't great at the moment. We'll add better model-parallel training support soon. Closing this ticket.