Tokenizer bos token problem when using KTO trainer
While using the KTO trainer, I ran into an issue. After some debugging, I identified the code that adds the bos_token to the beginning of the prompt_input_ids. However, some tokenizers do not define a bos_token, so bos_token_id is None and None gets prepended instead of a valid token id.
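For illustration, here is a minimal sketch of the failure mode (the token ids below are made up, not from a real tokenizer):

bos_token_id = None  # e.g. a tokenizer that defines no BOS token
prompt_input_ids = [3923, 374, 279]  # example ids for illustration only

# The current collation unconditionally prepends the BOS id:
prompt_input_ids = [bos_token_id] + prompt_input_ids
# -> [None, 3923, 374, 279], which is not a valid id sequence and breaks
#    once the batch is converted to tensors or fed to the model.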
The following check can be used to prepend the bos_token only when the tokenizer actually has one:
if self.tokenizer.bos_token_id is not None:
    # Tokenizer defines a BOS token: prepend it to the prompt and completion.
    batch[f"{prefix}prompt_input_ids"] = [self.tokenizer.bos_token_id] + all_tokens["prompt_input_ids"]
    batch[f"{prefix}prompt_attention_mask"] = [1] + all_tokens["prompt_attention_mask"]
    batch[f"{prefix}completion_input_ids"] = (
        [self.tokenizer.bos_token_id]
        + all_tokens["prompt_input_ids"]
        + all_tokens["answer_input_ids"]
        + [self.tokenizer.eos_token_id]
    )
    batch[f"{prefix}completion_attention_mask"] = (
        [1] + all_tokens["prompt_attention_mask"] + all_tokens["answer_attention_mask"] + [1]
    )
else:
    # No BOS token: build the sequences without prepending anything.
    batch[f"{prefix}prompt_input_ids"] = all_tokens["prompt_input_ids"]
    batch[f"{prefix}prompt_attention_mask"] = all_tokens["prompt_attention_mask"]
    batch[f"{prefix}completion_input_ids"] = (
        all_tokens["prompt_input_ids"]
        + all_tokens["answer_input_ids"]
        + [self.tokenizer.eos_token_id]
    )
    batch[f"{prefix}completion_attention_mask"] = (
        all_tokens["prompt_attention_mask"] + all_tokens["answer_attention_mask"] + [1]
    )
Concretely, consider the Qwen 7B chat model:
from transformers import AutoTokenizer
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-chat", trust_remote_code=True)
assert tokenizer.bos_token_id is None # This is true
A BOS token is not strictly necessary for DPO / KTO, so prepending it should be optional.
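In the meantime, a possible workaround (my own suggestion, not something the trainer does) is to assign a BOS token manually before constructing the trainer, assuming the tokenizer at least defines an EOS token:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen-7B-chat", trust_remote_code=True)

# Workaround sketch: reuse the EOS token as BOS so bos_token_id is never None.
# This assumes the tokenizer defines an EOS token; adjust for tokenizers that do not.
if tokenizer.bos_token_id is None and tokenizer.eos_token_id is not None:
    tokenizer.bos_token = tokenizer.eos_token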