werruww
https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/discussions/87
Extended the vocabulary to 32768 tokens: https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3
I ran the code on a Colab T4 instance with 12 GB of RAM and got:
```
ValueError                                Traceback (most recent call last)
<ipython-input> in <cell line: 3>()
      1 from accelerate import load_checkpoint_and_dispatch
      2
----> 3 model = load_checkpoint_and_dispatch(
      4     model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
      5 )

2 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py
...
```
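To see why this kind of `ValueError` comes up, it helps to look at what `device_map="auto"` is doing: each top-level module (treated as atomic, like the classes listed in `no_split_module_classes`) is greedily placed on the first device with enough free memory, spilling to CPU when the GPU is full. The sketch below is a simplified pure-Python illustration of that idea only, not Accelerate's actual algorithm; the module names, sizes, and memory budgets are made-up numbers.

```python
def auto_device_map(module_sizes, device_budgets):
    """Greedily assign each named module to a device, never splitting a module.

    module_sizes:   {module_name: size_in_arbitrary_units}
    device_budgets: {device_name: capacity}, in priority order (GPU first).
    """
    placement = {}
    free = dict(device_budgets)      # remaining capacity per device
    devices = list(device_budgets)   # e.g. ["cuda:0", "cpu"]
    for name, size in module_sizes.items():
        for dev in devices:
            if free[dev] >= size:
                placement[name] = dev
                free[dev] -= size
                break
        else:
            # No device can hold this unsplittable module; the real library
            # fails with a ValueError much like the traceback above.
            raise ValueError(f"{name} does not fit on any device")
    return placement

# Example: a 4-module model on a small GPU with CPU offload.
sizes = {"embed": 2, "block.0": 3, "block.1": 3, "block.2": 3, "head": 2}
budgets = {"cuda:0": 7, "cpu": 100}
print(auto_device_map(sizes, budgets))
```

On a 12 GB Colab instance the practical takeaway is the same as in the discussion: the model plus working memory has to fit within the combined GPU/CPU (and optionally disk-offload) budgets, or dispatch fails.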
```json
{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  ...
}
```
https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/discussions/88
```python
from huggingface_hub import snapshot_download

checkpoint = "openai-community/gpt2"
weights_location = snapshot_download(repo_id=checkpoint)

import torch.nn as nn  # import the torch.nn module and alias it as nn
from accelerate import init_empty_weights

with init_empty_weights():
    ...
```
Colab, with no T4 GPU and no TPU.
How do I create a model that is not from the GPT family and does not use minGPT, like Mistral, Phi-3.5, Llama 3.1, or Qwen?
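Architecturally, Mistral, Phi, Llama, and Qwen are all decoder-only transformers like GPT; what distinguishes them is mostly the configuration: SiLU/SwiGLU MLPs, RMSNorm, rotary position embeddings, and grouped-query attention instead of GPT-2's GELU/LayerNorm/learned positions. A minimal, hypothetical config for a small Llama/Mistral-style model might look like the dict below. The field names follow the Hugging Face `transformers` convention seen in the Mistral config above, but every size here is an illustrative assumption, not a real model's values.

```python
# Hypothetical config for a tiny Llama/Mistral-style decoder-only model.
# All sizes are illustrative; only the field names mirror the HF convention.
tiny_llama_style_config = {
    "architectures": ["LlamaForCausalLM"],  # decoder-only, like Mistral/Qwen
    "vocab_size": 32768,
    "hidden_size": 512,
    "intermediate_size": 1376,        # MLP width for the SwiGLU block
    "num_hidden_layers": 8,
    "num_attention_heads": 8,
    "num_key_value_heads": 4,         # < num_attention_heads => grouped-query attention
    "hidden_act": "silu",             # SiLU/SwiGLU instead of GPT-2's GELU
    "rms_norm_eps": 1e-6,             # RMSNorm instead of LayerNorm
    "max_position_embeddings": 4096,  # rotary embeddings, no learned position table
}

# Sanity checks that any such config must satisfy:
assert tiny_llama_style_config["hidden_size"] % tiny_llama_style_config["num_attention_heads"] == 0
assert tiny_llama_style_config["num_attention_heads"] % tiny_llama_style_config["num_key_value_heads"] == 0
```

From a config like this you can instantiate an untrained model with the corresponding `transformers` config class (e.g. `LlamaConfig` plus `AutoModelForCausalLM.from_config`), rather than starting from GPT-2 or minGPT code.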
https://github.com/werruww/run-prompt-with-accelerate