llama.cpp
Use `tokenizer.vocab_size()` instead of hardcoding 32000 when converting
When converting the model + tokenizer, use the vocabulary size returned by the tokenizer rather than assuming 32000.
Special tokens or other new tokens can be added to the tokenizer, so it is best not to assume the vocabulary has exactly 32000 entries. A sketch of the idea is shown below.
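A minimal sketch of the change, assuming a SentencePiece tokenizer as used by the original LLaMA models; the `load_vocab` helper is illustrative and not the actual convert-script code:

```python
from sentencepiece import SentencePieceProcessor

def load_vocab(tokenizer_path: str) -> list[tuple[bytes, float]]:
    sp = SentencePieceProcessor(model_file=tokenizer_path)
    # Query the tokenizer for its vocabulary size instead of hardcoding 32000;
    # it may be larger if special or other new tokens were added.
    vocab_size = sp.vocab_size()
    vocab = []
    for token_id in range(vocab_size):
        piece = sp.id_to_piece(token_id)
        score = sp.get_score(token_id)
        vocab.append((piece.encode("utf-8"), score))
    return vocab
```

The converter should then write `vocab_size` (and the matching number of token entries) into the output file rather than the fixed constant.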