airllm
pytorch_model.bin.index.json should exists.
How should I handle this error: f"{hf_cache_path}/pytorch_model.bin.index.json should exists."?
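Not part of the thread, but a quick way to see which index files a repo actually ships is to pull just its JSON files from the Hub and list them; a minimal sketch, assuming `huggingface_hub` is installed (the repo ID below is a placeholder):

```python
# Hedged diagnostic sketch: list the JSON files a Hub repo ships, to see
# whether it has pytorch_model.bin.index.json or only a safetensors index.
import os
from huggingface_hub import snapshot_download

# Fetch (or reuse from the local cache) only the small *.json files.
path = snapshot_download("your-org/your-model",  # placeholder repo ID
                         allow_patterns=["*.json"])
print(sorted(os.listdir(path)))
```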
Can you provide more info? Which hf model repo ID are you using?
Also, can you check if you have enough disk space?
Hi @lyogavin,

> Which hf model repo ID are you using?

It's a 7B Llama 2 model. The repo is https://huggingface.co/bkai-foundation-models/vietnamese-llama2-7b-40GB. It is gated, however, so I'm copying the list of downloaded files here to make them easy to track: Fetching 16 files: 100% 16/16 [01:28<00:00, 15.48s/it]
- config.json
- figure/training_loss.png
- README.md
- generation_config.json
- .gitattributes
- model.safetensors.index.json
- pt_lora_model/adapter_config.json
- pt_lora_model/special_tokens_map.json
- pt_lora_model/tokenizer_config.json
- special_tokens_map.json
- tokenizer.json
- model-00001-of-00002.safetensors
- model-00002-of-00002.safetensors
- adapter_model.bin
- tokenizer.model
> Also, can you check if you have enough disk space?

Yes, we are running on Colab and have enough disk space; the download completed 100% with room to spare.
Thank you for your project.
OK... it's a LoRA model...
We'll look into how to support this.
Thanks.
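Not the maintainer's fix, but a common workaround while LoRA support lands is to merge the adapter into its base model with `peft` and point airllm at the merged checkpoint. A minimal sketch, assuming `peft` and `transformers` are installed and you know the adapter's base model (both IDs below are assumptions):

```python
# Hedged workaround sketch: merge a LoRA adapter into its base model so
# loaders that expect a full sharded checkpoint can read it.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "meta-llama/Llama-2-7b-hf"             # assumption: the adapter's base model
adapter_id = "path/or/repo-of-the-lora-adapter"  # assumption: adapter location

base = AutoModelForCausalLM.from_pretrained(base_id)
merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()

merged.save_pretrained("merged-model")  # writes full weights plus index files
AutoTokenizer.from_pretrained(base_id).save_pretrained("merged-model")
```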
Same issue for airllm ("...pytorch_model.bin.index.json should exists") with HF model "argilla/notus-7b-v1-lora". Could you please fix this? Thanks.
This is a Mistral model; can you try the following:

```python
from airllm import AirLLMMistral

MAX_LENGTH = 128

# Load the model through airllm's Mistral loader.
model = AirLLMMistral("argilla/notus-7b-v1-lora")

input_text = ['What is the capital of China?',]

# Tokenize the prompt.
input_tokens = model.tokenizer(input_text,
                               return_tensors="pt",
                               return_attention_mask=False,
                               truncation=True,
                               max_length=MAX_LENGTH)

# Generate a few new tokens.
generation_output = model.generate(
    input_tokens['input_ids'].cuda(),
    max_new_tokens=5,
    use_cache=True,
    return_dict_in_generate=True)

print(model.tokenizer.decode(generation_output.sequences[0]))
```
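If this snippet still raises the same index error, the repo most likely ships only LoRA adapter weights rather than a full sharded checkpoint, which is the same situation as the Llama 2 report above.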
Can you try again with the latest version of airllm? It should work now.
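For anyone hitting this later: upgrading with `pip install -U airllm` and rerunning the snippet above should pick up the fix.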