
pytorch_model.bin.index.json should exists.

vuminhquang opened this issue · 6 comments

How should I handle this error: f"{hf_cache_path}/pytorch_model.bin.index.json should exists."?

vuminhquang commented Nov 28 '23
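For context, a minimal sketch of the kind of call that raises this error. The repo ID below is a placeholder, and AirLLMLlama2 is the Llama 2 loader class airllm documented at the time:

from airllm import AirLLMLlama2

# If the cached snapshot has no pytorch_model.bin.index.json (e.g. the
# repo ships only safetensors or LoRA adapter files), airllm raises:
#   f"{hf_cache_path}/pytorch_model.bin.index.json should exists."
model = AirLLMLlama2("some-org/some-llama2-7b")  # placeholder repo ID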

Can you provide more info? Which hf model repo ID are you using?

Also, can you check if you have enough disk space?

lyogavin commented Nov 29 '23

Hi @lyogavin ,

> Which hf model repo ID are you using?

It's a 7B Llama 2 model. The repo is https://huggingface.co/bkai-foundation-models/vietnamese-llama2-7b-40GB. It is gated, however, so I'm copying the download output here to make it easier to track (see the index-file check sketched after the list): Fetching 16 files: 100% 16/16 [01:28<00:00, 15.48s/it]

  • config.json
  • figure/training_loss.png
  • README.md
  • generation_config.json
  • .gitattributes
  • model.safetensors.index.json
  • pt_lora_model/adapter_config.json
  • pt_lora_model/special_tokens_map.json
  • pt_lora_model/tokenizer_config.json
  • special_tokens_map.json
  • tokenizer.json
  • model-00001-of-00002.safetensors
  • model-00002-of-00002.safetensors
  • adapter_model.bin
  • tokenizer.model
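Note that the list above includes model.safetensors.index.json but no pytorch_model.bin.index.json, which is the exact file the error message names. A minimal sketch for checking which index files a repo ships, using huggingface_hub (the token is a placeholder, needed because the repo is gated):

from huggingface_hub import list_repo_files

# The repo is gated, so an access token is required (placeholder below).
files = list_repo_files("bkai-foundation-models/vietnamese-llama2-7b-40GB",
                        token="hf_...")
print("pytorch_model.bin.index.json" in files)  # False: only a safetensors index
print("model.safetensors.index.json" in files)  # True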

> Also, can you check if you have enough disk space?

Yes, we are running on Colab and have enough disk space; the download completed (100%) and there is still plenty of disk left.

Thank you for your project.

vuminhquang commented Nov 29 '23

OK... It's a LoRA model...

We'll look into how to support this.

Thanks.

lyogavin commented Nov 29 '23
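Until airllm supports LoRA repos directly, one possible workaround (a sketch of a standard peft merge, not airllm's own method; the model IDs and output path below are placeholders) is to merge the adapter into its base model and save plain pytorch_model.bin shards, which produces the pytorch_model.bin.index.json airllm looks for:

import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

# Placeholder IDs: check the adapter's adapter_config.json
# ("base_model_name_or_path") for the real base model.
base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",  # assumed base model
    torch_dtype=torch.float16)
model = PeftModel.from_pretrained(base, "some-org/some-lora-adapter")
model = model.merge_and_unload()  # fold LoRA weights into the base

# safe_serialization=False writes sharded pytorch_model.bin files plus
# pytorch_model.bin.index.json, the file airllm expects to find.
model.save_pretrained("./merged-model", safe_serialization=False)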

Same issue for airllm "...pytorch_model.bin.index.json should exists" with HF model "argilla/notus-7b-v1-lora". Could you please fix this? Thanks

mchaduteau commented Dec 05 '23

> Same issue for airllm "...pytorch_model.bin.index.json should exists" with HF model "argilla/notus-7b-v1-lora". Could you please fix this? Thanks

This is a Mistral model; can you try the following:

from airllm import AirLLMMistral

MAX_LENGTH = 128

# Use the Mistral loader class for this repo.
model = AirLLMMistral("argilla/notus-7b-v1-lora")

input_text = ['What is the capital of China?',]
input_tokens = model.tokenizer(input_text,
    return_tensors="pt",
    return_attention_mask=False,
    truncation=True,
    max_length=MAX_LENGTH)

generation_output = model.generate(
    input_tokens['input_ids'].cuda(),
    max_new_tokens=5,
    use_cache=True,
    return_dict_in_generate=True)

# Decode the generated token IDs back to text.
print(model.tokenizer.decode(generation_output.sequences[0]))

lyogavin commented Dec 06 '23

@vuminhquang Can you try again with the latest version of airllm? It should work now.

lyogavin commented Dec 06 '23