openhathi_instruct
Tokenizer issue in Hathi Model
AutoTokenizer and LlamaTokenizer (which Sarvam used) behave differently with this model. AutoTokenizer sometimes splits words that are in the vocabulary into several tokens, whereas LlamaTokenizer tokenizes them as expected.
https://huggingface.co/sarvamai/OpenHathi-7B-Hi-v0.1-Base/discussions/5
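One way to check the difference described above is to tokenize the same words with both classes and list the ones that come out differently. The following is a minimal sketch, not the method used by Sarvam or in the linked discussion. The helper names find_tokenization_mismatches and load_and_compare are hypothetical, and the model id is taken from the URL above.

```python
from typing import Callable, Dict, List


def find_tokenization_mismatches(
    words: List[str],
    tokenize_a: Callable[[str], List[str]],
    tokenize_b: Callable[[str], List[str]],
) -> Dict[str, Dict[str, List[str]]]:
    """Return the words that the two tokenize functions split differently."""
    mismatches: Dict[str, Dict[str, List[str]]] = {}
    for word in words:
        pieces_a = tokenize_a(word)
        pieces_b = tokenize_b(word)
        if pieces_a != pieces_b:
            # Keep both splits so they can be inspected side by side.
            mismatches[word] = {"a": pieces_a, "b": pieces_b}
    return mismatches


def load_and_compare(model_id: str, words: List[str]) -> Dict[str, Dict[str, List[str]]]:
    """Hypothetical wrapper: load both tokenizer classes and compare them.

    Downloads tokenizer files from the Hugging Face Hub, so it needs network
    access and the transformers and sentencepiece packages installed.
    """
    # Imported lazily so the helper above works without transformers installed.
    from transformers import AutoTokenizer, LlamaTokenizer

    auto_tok = AutoTokenizer.from_pretrained(model_id)    # key "a" in the result
    llama_tok = LlamaTokenizer.from_pretrained(model_id)  # key "b" in the result
    return find_tokenization_mismatches(words, auto_tok.tokenize, llama_tok.tokenize)
```

A call such as load_and_compare("sarvamai/OpenHathi-7B-Hi-v0.1-Base", ["नमस्ते", "भारत"]) would return only the words on which the two tokenizers disagree; the sample words here are placeholders, not cases reported in the discussion. If mismatches appear, loading LlamaTokenizer directly, as Sarvam did, is the safer choice based on the behaviour reported above.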