meet-cjli
I ran into the same problem; my PyTorch version is 1.6.0.
Currently, the tokenizer automatically prepends a `bos_token_id` to the input prompt. Since the prompt already contains the `bos_token_id`, the resulting input_ids contain two `bos_token_id` tokens. > https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train_with_template.py#L101C1-L107C16
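To make the duplication concrete, here is a minimal sketch (no `transformers` dependency; the `encode` helper and `BOS_ID` value are stand-ins, not FastChat code): when the template has already injected BOS into the prompt and the tokenizer also prepends its own `bos_token_id`, the encoded ids start with two BOS tokens. With a real HuggingFace tokenizer the equivalent fix is passing `add_special_tokens=False`.

```python
BOS_ID = 1  # hypothetical bos_token_id for illustration

def encode(prompt_ids, add_special_tokens=True):
    """Toy stand-in for a tokenizer call: optionally prepend bos_token_id."""
    return ([BOS_ID] if add_special_tokens else []) + list(prompt_ids)

prompt_with_bos = [BOS_ID, 42, 43]  # the template already injected BOS

ids = encode(prompt_with_bos)
print(ids)   # [1, 1, 42, 43] -> two bos_token_id at the start

fixed = encode(prompt_with_bos, add_special_tokens=False)
print(fixed) # [1, 42, 43] -> single BOS, as intended
```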
@MrZhengXin @Oscarjia It works, but it seems to cause another issue in the following function (https://github.com/lm-sys/FastChat/blob/main/fastchat/train/train_with_template.py#L144). When we use `user_turn_separator` to split the conversation, the first item would be ``...
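A toy illustration of that splitting behavior (the separator string and conversation below are made-up examples, not FastChat's actual template): when the conversation string begins with the user-turn separator, `str.split` yields an empty string as the first element, which downstream turn-processing code then has to skip or mishandles.

```python
user_turn_separator = "USER: "  # hypothetical separator for illustration
conversation = "USER: hi ASSISTANT: hello ok USER: bye"

turns = conversation.split(user_turn_separator)
print(turns)
# ['', 'hi ASSISTANT: hello ok ', 'bye']
# turns[0] is '' because the conversation starts with the separator
```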