stanford_alpaca
How to modify llama-Xb-hf/tokenizer_config.json from HuggingFace?
As the title says, I found that llama-Xb-hf/tokenizer_config.json contains the following:
{"bos_token": "", "eos_token": "",
"model_max_length": 1000000000000000019884624838656,
"tokenizer_class": "LLaMATokenizer", "unk_token": ""}
How did your team modify this file so that the experiment can be run successfully?
Here is my modification. Is this correct?
{"bos_token": "<s>", "eos_token": "</s>",
"model_max_length": 1000000000000000019884624838656,
"tokenizer_class": "LlamaTokenizer", "unk_token": "<unk>"}
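For anyone who wants to apply the same change with a script instead of editing by hand, here is a minimal sketch. It assumes the values from my modification above are the right ones; the helper name `patch_tokenizer_config` and the `PATCH` dictionary are made up for illustration and are not part of stanford_alpaca or transformers. It only uses the standard `json` module and leaves every other key (such as `model_max_length`) unchanged.

```python
import json
from pathlib import Path

# Values copied from the proposed modification above; they are assumptions,
# not a confirmed fix from the maintainers.
PATCH = {
    "bos_token": "<s>",
    "eos_token": "</s>",
    "unk_token": "<unk>",
    "tokenizer_class": "LlamaTokenizer",
}


def patch_tokenizer_config(config_path):
    """Rewrite the special tokens and tokenizer class in a tokenizer_config.json.

    Hypothetical helper: reads the JSON file, overwrites the keys listed in
    PATCH, writes the file back, and returns the updated dictionary.
    """
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config.update(PATCH)  # keys not listed in PATCH are left as they were
    path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

Usage would look like `patch_tokenizer_config("llama-Xb-hf/tokenizer_config.json")`, with the path adjusted to wherever the converted checkpoint lives. It is probably worth keeping a backup of the original file first.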
I have the same issue.