LongChat Maybe a bug in the preprocess?

Open Richar-Du opened this issue 1 year ago • 3 comments

Thanks for your awesome work, which lets the community train LLMs on very long contexts! However, I find that in the preprocess function, the lines https://github.com/DachengLi1/LongChat/blob/a824bda25c0082e60973c35c79b0f35d69c6be2d/longchat/train/fine_tune/train.py#L125 and https://github.com/DachengLi1/LongChat/blob/a824bda25c0082e60973c35c79b0f35d69c6be2d/longchat/train/fine_tune/train.py#L137 set target to [1, -100, -100, ...], so the first element is not ignored. I think FastChat has the correct code: it first sets target[:cur_len] = IGNORE_TOKEN_ID, so the target becomes [-100, -100, -100, ...]. Am I right? @DachengLi1
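To make the difference concrete, here is a minimal sketch of the two masking variants described above. It is not the actual code from either repository: the function names, the token ids, and the single-round `instruction_len` are all made up for illustration, and plain Python lists stand in for tensors.

```python
IGNORE_TOKEN_ID = -100  # label value the loss skips (default ignore_index of PyTorch cross-entropy)


def mask_without_bos(input_ids, instruction_len):
    """Hypothetical variant: mask only the instruction span after the BOS token."""
    target = list(input_ids)
    cur_len = 1  # position 0 holds the BOS token and is skipped, not masked
    target[cur_len:cur_len + instruction_len] = [IGNORE_TOKEN_ID] * instruction_len
    return target


def mask_with_bos(input_ids, instruction_len):
    """Hypothetical variant: mask the BOS token first, then the instruction span."""
    target = list(input_ids)
    cur_len = 1
    target[:cur_len] = [IGNORE_TOKEN_ID] * cur_len  # the extra step described in the issue
    target[cur_len:cur_len + instruction_len] = [IGNORE_TOKEN_ID] * instruction_len
    return target


# Made-up ids: 1 is BOS, 11-13 are the instruction, 21-22 are the response.
input_ids = [1, 11, 12, 13, 21, 22]
print(mask_without_bos(input_ids, 3))  # first element stays 1
print(mask_with_bos(input_ids, 3))     # first element becomes -100
```

With these made-up ids, the first variant leaves the BOS id in position 0 of the target, while the second replaces it with IGNORE_TOKEN_ID; the response tokens are kept as labels in both.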

Jul 22 '23 04:07 Richar-Du
