llm.c
void tokenizer_init failed
allocated 474 MiB for model parameters
train_gpt2fp32cu: train_gpt2_fp32.cu:1815: void tokenizer_init(Tokenizer*, const char*): Assertion `header[1] == 1' failed.
[1] 2229854 abort (core dumped) ./train_gpt2fp32cu
Try rebuilding your data files with train_gpt2.py. The tokenizer headers have changed, so your existing tokenizer file no longer passes the `header[1] == 1` check in tokenizer_init.
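If you want to confirm that the header is the problem before regenerating anything, a small helper along these lines may help. It is only a sketch: it assumes the tokenizer file starts with native-endian 32-bit integers and that the value at index 1 is the one the failing assertion compares against. The function name read_header_fields is made up for illustration and is not part of llm.c.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper, not part of llm.c.
   Reads the first n 32-bit values of the file at path into out.
   Assumes the header is stored as native-endian 32-bit integers.
   Returns 0 on success, -1 if the file cannot be opened or is too short. */
int read_header_fields(const char *path, uint32_t *out, size_t n) {
    FILE *f = fopen(path, "rb");
    if (f == NULL) {
        return -1;
    }
    size_t got = fread(out, sizeof(uint32_t), n, f);
    fclose(f);
    return got == n ? 0 : -1;
}
```

Call it on your tokenizer file with n = 2 and print out[1]. If that value is not 1, the file does not match what this build of train_gpt2_fp32.cu expects, and regenerating the data files is the likely fix.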
Hey @Bing1002, if you're no longer facing this problem, feel free to close the issue. Otherwise, try rerunning the Python script and then running your C program again; it should be fine.
Thank you for your response.