Results: 82 comments of imi

If I wanted to try and test this, can I use the same settings in the Colab that I would if I were running Pygmalion, or does anything need to...

Ahh. I was trying to run the 13B model on non-Pro Colab, so that's why I got busted. Thanks for the info!

I was able to run it; I just ran out of memory when I tried to generate. Probably the token size, like you said. I'll try again tomorrow, but the 7B...

Doing the same thing as previously (changing the lines from the first post), I now get this error. Any idea what to do? ``` ===================================BUG REPORT=================================== Welcome to bitsandbytes. For bug...

Thanks, that works. Oh, and yeah, I wanted to try the 13B model with the instructions you wrote, but I only have 12.7 GB of RAM on Colab, which seems to bust at...

Repository Not Found for url: https://huggingface.co/models/llama-7b-hf/resolve/main/config.json
Did they change something? 🤔

@NoShinSekai Any idea why I get this error when I try to load 4-bit models? ``` Loading llama-13b-hf-int4... Could not find the quantized model in .pt or .safetensors format, exiting......
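
A minimal sketch of the kind of check behind that message, assuming the loader simply looks for a .pt or .safetensors checkpoint inside the model's folder (the folder name below is hypothetical, and this is not the webui's actual code):

```python
# Rough illustration of why the "Could not find the quantized model" message
# appears: no .pt or .safetensors checkpoint exists inside the model's folder.
from pathlib import Path

model_dir = Path("models/llama-13b-hf-int4")  # hypothetical folder name
candidates = sorted(model_dir.glob("*.pt")) + sorted(model_dir.glob("*.safetensors"))

if not candidates:
    print(f"Could not find a quantized checkpoint (.pt or .safetensors) in {model_dir}")
else:
    print(f"Quantized checkpoint found: {candidates[0].name}")
```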

I tried again and now I'm not getting that error; I must have missed something. I'm struggling to load a model in, though. This is the setup: ![image](https://user-images.githubusercontent.com/15861396/229569697-c093a660-ea16-4d00-a61a-8eb057717453.png) I tried these...

@jllllll Tried that; the error changed to ``` Loading gpt4-x-alpaca-13b-native-ggml-model-q4_0... Traceback (most recent call last): File "D:\textgen\oobabooga-windows\text-generation-webui\server.py", line 302, in shared.model, shared.tokenizer = load_model(shared.model_name) File "D:\textgen\oobabooga-windows\text-generation-webui\modules\models.py", line 106, in load_model...

Same, I think. Test: 64 GB RAM, 3950X, context size 1058, Alpaca 13B LoRA merged.
28 threads - 0.54 tokens/s
20 - 0.69
16 - 0.62
12 - 0.62
8 -...
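
A minimal sketch of how those tokens/s numbers per thread count could be measured, assuming llama-cpp-python as the loader (the comment above does not say which one was used); the model filename and prompt are placeholders:

```python
# Thread-count benchmark sketch with llama-cpp-python (an assumption; not
# necessarily the setup used above). Model path and prompt are placeholders.
import time
from llama_cpp import Llama

MODEL_PATH = "models/alpaca-13b-lora-merged-q4_0.bin"  # hypothetical file
PROMPT = "Explain what a LoRA merge is in one paragraph."

for n_threads in (28, 20, 16, 12, 8):
    # Reload the model for each run so every thread count starts from a clean state.
    llm = Llama(model_path=MODEL_PATH, n_ctx=1058, n_threads=n_threads, verbose=False)
    start = time.time()
    out = llm(PROMPT, max_tokens=128)
    generated = out["usage"]["completion_tokens"]
    print(f"{n_threads:>2} threads: {generated / (time.time() - start):.2f} tokens/s")
```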