openchat
What is the meaning of padding-free in the README?
In the readme, it says:
The OpenChat training system utilizes padding-free training and the Multipack Sampler, achieving a 3~10x speedup compared to the conventional padded training.
What does padding-free mean here? Do all sequences in one batch still need to have the same length? If there is no padding, how is batching done?
Thanks!
My impression is that short samples are concatenated into one long sample.
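To make the concatenation idea concrete, here is a minimal sketch in plain Python. It is not OpenChat's actual code: `pack_sequences`, `max_tokens`, and the greedy first-fit strategy are assumptions chosen for illustration, and the real Multipack Sampler may balance batches differently. Each packed buffer keeps a list of cumulative boundaries (called `cu_seqlens` here, a name borrowed from variable-length attention kernels) so attention can be restricted to each original sample.

```python
from typing import List, Tuple


def pack_sequences(
    seqs: List[List[int]], max_tokens: int
) -> List[Tuple[List[int], List[int]]]:
    """Pack token sequences into flat buffers of at most max_tokens tokens.

    Hypothetical helper for illustration only. Returns a list of
    (flat_tokens, cu_seqlens) pairs, where cu_seqlens[i]:cu_seqlens[i + 1]
    is the slice of flat_tokens holding the i-th sequence in that buffer.
    """
    bins: List[List[List[int]]] = []  # sequences assigned to each buffer
    free: List[int] = []              # remaining token capacity per buffer

    # First-fit decreasing: place longer sequences first.
    for seq in sorted(seqs, key=len, reverse=True):
        if len(seq) > max_tokens:
            raise ValueError("sequence longer than max_tokens")
        for i, room in enumerate(free):
            if len(seq) <= room:
                bins[i].append(seq)
                free[i] -= len(seq)
                break
        else:
            # No existing buffer has room, so open a new one.
            bins.append([seq])
            free.append(max_tokens - len(seq))

    packed = []
    for group in bins:
        flat: List[int] = []
        cu_seqlens = [0]
        for seq in group:
            flat.extend(seq)              # concatenate, no pad tokens
            cu_seqlens.append(len(flat))  # record the sequence boundary
        packed.append((flat, cu_seqlens))
    return packed


if __name__ == "__main__":
    samples = [[1, 2, 3], [4], [5, 6], [7, 8, 9, 10]]
    for flat, cu in pack_sequences(samples, max_tokens=5):
        print(flat, cu)
```

With these four samples, padding every one to the longest length would take 4 x 4 = 16 token slots, while the packed buffers hold exactly the 10 real tokens. So the sequences in a batch do not need equal lengths; what stays roughly constant is the total token count per buffer.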