airllm
pytorch_model.bin.index.json should exists.
How should I handle this error: f"{hf_cache_path}/pytorch_model.bin.index.json should exists."?
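Not part of the thread, but a quick way to see which index files a repo actually ships is to pull just its JSON files from the Hub and list them; a minimal sketch, assuming `huggingface_hub` is installed (the repo ID below is a placeholder):

```python
# Hedged diagnostic sketch: list the JSON files a Hub repo ships, to see
# whether it has pytorch_model.bin.index.json or only a safetensors index.
import os
from huggingface_hub import snapshot_download

# Fetch (or reuse from the local cache) only the small *.json files.
path = snapshot_download("your-org/your-model",  # placeholder repo ID
                         allow_patterns=["*.json"])
print(sorted(os.listdir(path)))
```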
Can you provide more info? Which hf model repo ID are you using?
Also, can you check if you have enough disk space?
Hi @lyogavin,

> Which hf model repo ID are you using?

It's a 7B Llama 2 model. The repo is https://huggingface.co/bkai-foundation-models/vietnamese-llama2-7b-40GB. It is gated, however, so I'm copying the list of downloaded files here to make them easy to track: Fetching 16 files: 100% 16/16 [01:28<00:00, 15.48s/it]
- config.json
- figure/training_loss.png
- README.md
- generation_config.json
- .gitattributes
- model.safetensors.index.json
- pt_lora_model/adapter_config.json
- pt_lora_model/special_tokens_map.json
- pt_lora_model/tokenizer_config.json
- special_tokens_map.json
- tokenizer.json
- model-00001-of-00002.safetensors
- model-00002-of-00002.safetensors
- adapter_model.bin
- tokenizer.model
> Also, can you check if you have enough disk space?

Yes, we are running on Colab and have enough disk space; the download completed 100% with room to spare.
Thank you for your project.
OK... it's a LoRA model...
We'll look into how to support this.
Thanks.
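Not the maintainer's fix, but a common workaround while LoRA support lands is to merge the adapter into its base model with `peft` and point airllm at the merged checkpoint. A minimal sketch, assuming `peft` and `transformers` are installed and you know the adapter's base model (both IDs below are assumptions):

```python
# Hedged workaround sketch: merge a LoRA adapter into its base model so
# loaders that expect a full sharded checkpoint can read it.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"             # assumption: the adapter's base model
adapter_id = "path/or/repo-of-the-lora-adapter"  # assumption: adapter location

base = AutoModelForCausalLM.from_pretrained(base_id)
merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

merged.save_pretrained("merged-model")  # writes full weights plus index files
AutoTokenizer.from_pretrained(base_id).save_pretrained("merged-model")
```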
Same issue for airllm ("...pytorch_model.bin.index.json should exists") with HF model "argilla/notus-7b-v1-lora". Could you please fix this? Thanks.
This is a Mistral model; can you try the following:

```python
from airllm import AirLLMMistral

MAX_LENGTH = 128

# Load the model through airllm's Mistral loader.
model = AirLLMMistral("argilla/notus-7b-v1-lora")

input_text = ['What is the capital of China?',]

# Tokenize the prompt.
input_tokens = model.tokenizer(input_text,
                               return_tensors="pt",
                               return_attention_mask=False,
                               truncation=True,
                               max_length=MAX_LENGTH)

# Generate a few new tokens.
generation_output = model.generate(
    input_tokens['input_ids'].cuda(),
    max_new_tokens=5,
    use_cache=True,
    return_dict_in_generate=True)

print(model.tokenizer.decode(generation_output.sequences[0]))
```
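If this snippet still raises the same index error, the repo most likely ships only LoRA adapter weights rather than a full sharded checkpoint, which is the same situation as the Llama 2 report above.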
Can you try again with the latest version of airllm? It should work now.
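For anyone hitting this later: upgrading with `pip install -U airllm` and rerunning the snippet above should pick up the fix.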