gpt-fast
Allow small models to work with convert_hf_checkpoint. Added TinyLlama to the model list
Small models on HF don't have pytorch_model.bin.index.json files, since they are unnecessary for a single-shard checkpoint. I changed convert_hf_checkpoint.py to fall back to a single pytorch_model.bin file when no index is present. I also added PY007/TinyLlama-1.1B-intermediate-step-480k-1T to the model list, since it's used in the speculate_7B_int4.sh script.
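The fallback described above can be sketched roughly as follows. This is a minimal, hedged sketch, not the actual diff: the helper name `resolve_checkpoint_files` is hypothetical, and the real convert_hf_checkpoint.py may structure this differently. It assumes the standard HF layout where sharded checkpoints carry a `pytorch_model.bin.index.json` with a `weight_map`, and single-shard checkpoints carry just `pytorch_model.bin`.

```python
import json
from pathlib import Path


def resolve_checkpoint_files(checkpoint_dir):
    """Return the list of weight files to load for a HF checkpoint.

    Large models ship pytorch_model.bin.index.json, whose weight_map maps
    each parameter name to a shard filename. Small models (e.g. TinyLlama)
    ship a single pytorch_model.bin with no index, so fall back to that.
    (Hypothetical helper illustrating the fallback; not the actual code.)
    """
    checkpoint_dir = Path(checkpoint_dir)
    index_path = checkpoint_dir / "pytorch_model.bin.index.json"
    if index_path.is_file():
        with open(index_path) as f:
            index = json.load(f)
        # Deduplicate: many parameters point at the same shard file.
        shards = sorted(set(index["weight_map"].values()))
        return [checkpoint_dir / shard for shard in shards]
    single = checkpoint_dir / "pytorch_model.bin"
    if single.is_file():
        return [single]
    raise FileNotFoundError(f"no checkpoint files found in {checkpoint_dir}")
```

The conversion script would then load each resolved file with `torch.load(path, map_location="cpu", ...)` and remap the tensors as before; only the file discovery step changes.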
TinyLlama now works, with the exception that weights_only would have to be changed to False on line 74 of convert_hf_checkpoint.py. I'll leave that up for discussion, since it's less secure.