deita
The length of samples
It seems each sample in the deita dataset consists of many turns and is very long (>10k tokens). Your paper mentions that the maximum input length for SFT is 2048 tokens. Does that mean most of the text in each training sample is truncated and discarded?
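For reference, here is a minimal sketch of how one could measure per-sample token counts to check this; the dataset id, the ShareGPT-style `conversations` field, and the tokenizer used below are assumptions on my part, not anything specified by the paper or repo.

```python
# Sketch: count tokens per conversation and see how many exceed the 2048 limit.
# Dataset id, field names, and tokenizer are assumptions for illustration only.
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")  # assumed base tokenizer
ds = load_dataset("hkust-nlp/deita-10k-v0", split="train")             # assumed dataset id

def count_tokens(example):
    # Join all turns of the conversation and tokenize the full text.
    text = "\n".join(turn["value"] for turn in example["conversations"])
    return {"n_tokens": len(tokenizer(text)["input_ids"])}

with_lengths = ds.map(count_tokens)
over_limit = sum(n > 2048 for n in with_lengths["n_tokens"])
print(f"{over_limit}/{len(ds)} samples exceed 2048 tokens")
```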