
Suggestion: Packing with ConstantLengthDataset

Open. AngainorDev opened this issue 1 year ago • 1 comment

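Has anyone tested packing with ConstantLengthDataset?

I just heard about it here: https://huggingface.co/blog/stackllama

It could be better suited than --group_by_len, since packing fills every sequence to a fixed length instead of batching samples of similar length.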

AngainorDev · Apr 06 '23 16:04

I just looked at the code. Does it affect the randomness? It seems we always take the samples in order within a single "iter" function in the dataset.

allanj · Apr 10 '23 09:04
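
For anyone following the thread, here is a rough sketch of what packing means and why the ordering question comes up. This is not the library's ConstantLengthDataset code; `pack_constant_length`, `seq_length`, and `eos_id` are hypothetical names made up for illustration, and the token ids are toy values. It assumes examples are already tokenized, joins them with an end-of-sequence token, and cuts the stream into fixed-size chunks, reading the source strictly in order.

```python
import random
from typing import Iterable, Iterator, List


def pack_constant_length(
    tokenized: Iterable[List[int]],
    seq_length: int,
    eos_id: int,
) -> Iterator[List[int]]:
    """Hypothetical helper: concatenate examples and yield fixed-length chunks."""
    buffer: List[int] = []
    for ids in tokenized:
        # Examples are consumed in the order the source provides them.
        buffer.extend(ids)
        buffer.append(eos_id)  # separator between packed examples
        while len(buffer) >= seq_length:
            yield buffer[:seq_length]
            buffer = buffer[seq_length:]
    # Leftover tokens shorter than seq_length are dropped in this sketch.


# Toy token ids; 0 plays the role of the EOS token.
examples = [[1, 2, 3], [4, 5], [6, 7, 8, 9], [10]]
packed = list(pack_constant_length(examples, seq_length=4, eos_id=0))

# One way to get randomness back (an assumption, not necessarily what the
# library does): shuffle the source examples before packing them.
rng = random.Random(0)
shuffled = examples[:]
rng.shuffle(shuffled)
packed_shuffled = list(pack_constant_length(shuffled, seq_length=4, eos_id=0))
```

In this sketch the chunk boundaries depend only on the order of the input, so any randomness has to come from shuffling the examples before packing or shuffling the packed chunks afterwards.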