Paul Richardson

Results: 13 comments of Paul Richardson

Don't quote me on this, but I think either your groupsize is off or you don't have GPTQ set up right... I think...
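To make the groupsize point concrete, here is a minimal, hedged sketch of loading a GPTQ checkpoint with GPTQ-for-LLaMa; the module path, helper signature, and file names are assumptions, so adapt them to your setup.

```python
# Hedged sketch only; assumes GPTQ-for-LLaMa exposes
# load_quant(model, checkpoint, wbits, groupsize) in llama_inference.py.
# The groupsize passed here must match the value the checkpoint was
# quantized with (e.g. 128 vs. -1); a mismatch typically fails to load
# or produces gibberish.
from llama_inference import load_quant  # assumed module from GPTQ-for-LLaMa

model = load_quant(
    "models/llama-7b-hf",                      # hypothetical HF config directory
    "models/llama-7b-4bit-128g.safetensors",   # hypothetical quantized checkpoint
    4,    # wbits used at quantization time
    128,  # groupsize used at quantization time
)
```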

With only 4 GB of VRAM @bloodsign is probably right... you'll OOM with most anything. Regular offloading to CPU is *usually* pretty slow, with one exception: llama.cpp. I'd recommend looking...
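If it helps, here is a minimal sketch of running a quantized model through the llama-cpp-python bindings; the model path and layer count are placeholders, and whether 20 layers actually fit in 4 GB is an assumption you would need to tune.

```python
# Minimal sketch, assuming llama-cpp-python is installed
# (pip install llama-cpp-python) and you have a quantized ggml/gguf model file.
from llama_cpp import Llama

llm = Llama(
    model_path="models/llama-7b.Q4_K_M.gguf",  # hypothetical model file
    n_gpu_layers=20,  # offload only as many layers as ~4 GB of VRAM allows
    n_ctx=2048,
)

out = llm("Q: Why is llama.cpp fast on CPU? A:", max_tokens=64)
print(out["choices"][0]["text"])
```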

Ok so I managed to get the model to LOAD by using your suggestion; just change `monkey_patch_gptq_lora.py` as indicated below:

```python
def load_model_llama(model_name):
    config_path = str(Path(f'{shared.args.model_dir}/{model_name}'))
    model_path = str(find_quantized_model_file(model_name))
    ...
```

> I thought that not splitting "LlamaDecoderLayer" was enough, is it not? I only did offloading to CPU with this. If by not splitting "LlamaDecoderLayer" you mean modifying `autograd_4bit.py` on...
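For context, this is roughly how "not splitting" a decoder layer is usually expressed with 🤗 accelerate's device-map machinery; the memory budgets and model path below are placeholders, and this is a generic sketch rather than the actual `autograd_4bit.py` change.

```python
# Generic sketch of CPU offloading with accelerate; not the actual
# autograd_4bit.py modification discussed above.
# no_split_module_classes keeps each LlamaDecoderLayer on a single device
# instead of splitting its weights between GPU and CPU.
from accelerate import dispatch_model, infer_auto_device_map
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("models/llama-7b-hf")  # placeholder path
device_map = infer_auto_device_map(
    model,
    max_memory={0: "3GiB", "cpu": "24GiB"},  # placeholder budgets
    no_split_module_classes=["LlamaDecoderLayer"],
)
model = dispatch_model(model, device_map=device_map)
```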

Not trying to be rude, but you gotta give us more to work with... can you try uploading 1. the complete log from prompt to prompt, 2. a screenshot, 3. system info...

This would be a killer feature... I agree

> I suggest using the training script in https://github.com/tloen/alpaca-lora directly.
> Multigpu requires torchrun, which is a multiprocess structure too hard to manage in a webui. You should use a...

> I suggest using the training script in https://github.com/tloen/alpaca-lora directly. Multigpu requires torchrun, which is a multiprocess structure too hard to manage in a webui. You should use a script...

So... TL;DR: the new transformers release breaks quants, and the patch is to change the contents of `special_tokens_map.json` and `tokenizer_config.json` to match ooba's content here: https://github.com/oobabooga/text-generation-webui/issues/931#issuecomment-1501259027 ?

~~... I'm still getting gibberish~~ I got it working by 1. downloading the model from https://huggingface.co/Neko-Institute-of-Science/LLaMA-65B-HF/tree/main and 2. replacing `special_tokens_map.json` and `tokenizer_config.json` with the ones here: https://huggingface.co/chavinlo/gpt4-x-alpaca
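For reference, a minimal sketch of what that file replacement boils down to, assuming the fix is restoring the standard LLaMA special tokens (`<s>`, `</s>`, `<unk>`); the exact JSON contents are an assumption, so when in doubt copy the actual files from the repo linked above instead.

```python
# Hedged sketch: rewrite the two tokenizer JSON files with the standard
# LLaMA special tokens. Assumption: the gibberish fix amounts to replacing
# empty special-token strings; if unsure, copy the real files from the
# linked repo rather than generating them like this.
import json
from pathlib import Path

model_dir = Path("models/LLaMA-65B-HF")  # hypothetical local model folder

special_tokens = {"bos_token": "<s>", "eos_token": "</s>", "unk_token": "<unk>"}

with open(model_dir / "special_tokens_map.json", "w") as f:
    json.dump(special_tokens, f, indent=2)

tokenizer_config = {
    "tokenizer_class": "LlamaTokenizer",
    "model_max_length": 2048,
    **special_tokens,
}
with open(model_dir / "tokenizer_config.json", "w") as f:
    json.dump(tokenizer_config, f, indent=2)
```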