LongChat
Possible bug in the preprocess function?
Thanks for your awesome work, which lets the community train LLMs on very long contexts! However, I find that in the preprocess function, line
https://github.com/DachengLi1/LongChat/blob/a824bda25c0082e60973c35c79b0f35d69c6be2d/longchat/train/fine_tune/train.py#L125
and line:
https://github.com/DachengLi1/LongChat/blob/a824bda25c0082e60973c35c79b0f35d69c6be2d/longchat/train/fine_tune/train.py#L137
will set target to [1, -100, -100, ...], so the first element is not ignored. I think FastChat has the correct code: it first sets target[:cur_len] = IGNORE_TOKEN_ID, so the target becomes [-100, -100, -100, ...]. Am I right?
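To make the difference concrete, here is a minimal sketch in plain Python (lists instead of tensors). It is not the code from either repository: the function names, the toy token ids, and instruction_len are made up for illustration, and I am assuming the BOS token id is 1, as in the target shown above.

```python
IGNORE_TOKEN_ID = -100  # label value the loss skips (the -100 in the targets above)


def mask_skipping_bos(input_ids, instruction_len):
    """Hypothetical helper: masks the instruction tokens but leaves position 0 alone."""
    target = list(input_ids)
    for i in range(1, instruction_len):
        target[i] = IGNORE_TOKEN_ID
    return target


def mask_including_bos(input_ids, instruction_len):
    """Hypothetical helper: masks target[:cur_len] first, then the instruction tokens."""
    target = list(input_ids)
    cur_len = 1  # the leading BOS token
    target[:cur_len] = [IGNORE_TOKEN_ID] * cur_len
    for i in range(cur_len, instruction_len):
        target[i] = IGNORE_TOKEN_ID
    return target


# Toy sequence: BOS (assumed id 1), three instruction tokens, two response tokens.
input_ids = [1, 306, 4966, 366, 29889, 2]
instruction_len = 4  # BOS plus the three instruction tokens

print(mask_skipping_bos(input_ids, instruction_len))   # [1, -100, -100, -100, 29889, 2]
print(mask_including_bos(input_ids, instruction_len))  # [-100, -100, -100, -100, 29889, 2]
```

With the first variant, the BOS token at position 0 keeps its id as a label; with the second, it is ignored like the rest of the prompt.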
@DachengLi1