werruww

204 comments by werruww

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/discussions/87

Extended vocabulary to 32768 https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3

I ran the code on Colab (T4 GPU, 12 GB RAM).

ValueError                                Traceback (most recent call last)
in ()
      1 from accelerate import load_checkpoint_and_dispatch
      2
----> 3 model = load_checkpoint_and_dispatch(
      4     model, checkpoint=weights_location, device_map="auto", no_split_module_classes=['Block']
      5 )

2 frames
/usr/local/lib/python3.10/dist-packages/accelerate/utils/modeling.py ...
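The traceback is truncated, but on a 12 GB T4 this call generally needs disk offload enabled so that layers which fit on neither the GPU nor the CPU RAM can be spilled to disk. A minimal sketch of that variant, assuming the empty model and weights_location from the earlier cells; offload_folder and dtype are my additions, not part of the original call:

from accelerate import load_checkpoint_and_dispatch

model = load_checkpoint_and_dispatch(
    model,
    checkpoint=weights_location,        # local snapshot downloaded earlier
    device_map="auto",                  # let accelerate place layers on GPU/CPU/disk
    no_split_module_classes=['Block'],  # keep each transformer block on a single device
    offload_folder="offload",           # needed once any layer is mapped to disk
    dtype="float16",                    # halve the weight memory footprint
)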

{
  "architectures": [
    "MistralForCausalLM"
  ],
  "attention_dropout": 0.0,
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  ...

https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3/discussions/88

from huggingface_hub import snapshot_download

checkpoint = "openai-community/gpt2"
weights_location = snapshot_download(repo_id=checkpoint)

import torch.nn as nn  # import the torch.nn module and alias it as nn
from accelerate import init_empty_weights

with init_empty_weights():
    ...
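The snippet is cut off after the `with init_empty_weights():` line. A minimal sketch of how that block is usually completed, following the accelerate big-model-inference pattern rather than the original comment:

from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained(checkpoint)  # "openai-community/gpt2"
with init_empty_weights():
    # The module tree is created on the "meta" device, so no real weight
    # tensors are allocated; the actual weights are attached afterwards by
    # load_checkpoint_and_dispatch.
    model = AutoModelForCausalLM.from_config(config)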

How do I create a model that is not from the GPT family and does not use minGPT, such as Mistral, Phi-3.5, Llama 3.1, or Qwen?
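A sketch of one way to do that with the same accelerate flow as above: instantiate the architecture from its Hub config under init_empty_weights, then dispatch the downloaded checkpoint. The repo ID and the MistralDecoderLayer class name below are examples for Mistral; other architectures use their own decoder-layer class, and gated checkpoints (Mistral, Llama 3.1) also need a Hugging Face access token.

from huggingface_hub import snapshot_download
from accelerate import init_empty_weights, load_checkpoint_and_dispatch
from transformers import AutoConfig, AutoModelForCausalLM, AutoTokenizer

checkpoint = "mistralai/Mistral-7B-Instruct-v0.3"          # example repo ID (gated)
weights_location = snapshot_download(repo_id=checkpoint)   # download config + weights

config = AutoConfig.from_pretrained(checkpoint)
with init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)       # skeleton on the meta device

model = load_checkpoint_and_dispatch(
    model,
    checkpoint=weights_location,
    device_map="auto",
    no_split_module_classes=["MistralDecoderLayer"],       # per-architecture block class
    offload_folder="offload",
    dtype="float16",
)

tokenizer = AutoTokenizer.from_pretrained(checkpoint)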

https://github.com/werruww/run-prompt-with-accelerate