
error with monkeypatch and model gpt-j 4bit and lora

Open ReDXeoL opened this issue 1 year ago • 5 comments

Describe the bug

error with monkeypatch and model gpt-j and lora

Hello, I would like to discuss a problem I have only with 4-bit quantized GPT-J models (gpt-j-6B-alpaca-4bit-128g), quantized with AutoGPTQ, when using the monkeypatch to train with LoRA.

(This only happens with this type of model; models like Vicuna or WizardLM work very well with the monkeypatch.)

ERROR: Load Model ...
WARNING:The safetensors archive passed at models\bertin-gpt-j-6B-alpaca-4bit-128g\gptq_model-4bit-128g.safetensors does not contain metadata. Make sure to save your model with the save_pretrained method. Defaulting to 'pt' metadata.
Traceback (most recent call last):
  File "A:\LLMs_LOCAL\oobabooga_windows\text-generation-webui\server.py", line 932, in <module>
    shared.model, shared.tokenizer = load_model(shared.model_name)
  File "A:\LLMs_LOCAL\oobabooga_windows\text-generation-webui\modules\models.py", line 153, in load_model
    model, _ = load_model_llama(model_name)
  File "A:\LLMs_LOCAL\oobabooga_windows\text-generation-webui\modules\monkey_patch_gptq_lora.py", line 24, in load_model_llama
    model, tokenizer = load_llama_model_4bit_low_ram(config_path, model_path, groupsize=shared.args.groupsize, is_v1_model=False)
  File "A:\LLMs_LOCAL\oobabooga_windows\text-generation-webui\repositories\alpaca_lora_4bit\autograd_4bit.py", line 204, in load_llama_model_4bit_low_ram
    model = accelerate.load_checkpoint_and_dispatch(
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\accelerate\big_modeling.py", line 479, in load_checkpoint_and_dispatch
    load_checkpoint_in_model(
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 946, in load_checkpoint_in_model
    set_module_tensor_to_device(model, param_name, param_device, value=param, dtype=dtype)
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\accelerate\utils\modeling.py", line 135, in set_module_tensor_to_device
    if old_value.device == torch.device("meta") and device not in ["meta", torch.device("meta")] and value is None:
AttributeError: 'NoneType' object has no attribute 'device'

Is there an existing issue for this?

  • [X] I have searched the existing issues

Reproduction

Load the monkeypatch in the UI.

Screenshot

(screenshot omitted)

Logs

ERROR:
Load Model ...

System Info

RTX 3060 (12 GB), i7-10700, 32 GB RAM

ReDXeoL avatar May 12 '23 14:05 ReDXeoL

GPT-J needs a different no-split layer, "GPTJBlock", and you are using accelerate. It also requires deleting embed_out instead of lm_head. I'm not sure the normal monkeypatch code is set up to handle it. I will see if I can train GPT-J today, because it definitely loads on my setup, but I have only tried inference.
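
Very roughly, and untested: a minimal sketch of the loading half, using paths from the traceback above (the real loader in alpaca_lora_4bit does more than this):

import accelerate
from transformers import AutoConfig, AutoModelForCausalLM

model_dir = "models/bertin-gpt-j-6B-alpaca-4bit-128g"  # path from the traceback
config = AutoConfig.from_pretrained(model_dir)

# Build the model skeleton on the meta device, then dispatch the checkpoint.
with accelerate.init_empty_weights():
    model = AutoModelForCausalLM.from_config(config)

model = accelerate.load_checkpoint_and_dispatch(
    model,
    f"{model_dir}/gptq_model-4bit-128g.safetensors",
    device_map="auto",
    # GPT-J's transformer block class is GPTJBlock; LLaMA's is
    # LlamaDecoderLayer, which is what the LLaMA-only code assumes.
    no_split_module_classes=["GPTJBlock"],
)
# Per the note above, for GPT-J it is embed_out rather than lm_head
# that has to be deleted.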

Ph0rk0z avatar May 12 '23 15:05 Ph0rk0z

Thank you very much for your answer. Sorry, I'm new to LoRA; it's the first time I've used the monkeypatch. Here is the code I use to quantize it to 4 bits: https://huggingface.co/TheBloke/GPT4All-13B-snoozy-GGML/discussions/1#64557c12f61f10d69dd10e72

ReDXeoL avatar May 12 '23 16:05 ReDXeoL

I was able to train the roleplay LoRA on Pygmalion 6B 4-bit using my fork:

INFO:Getting model ready...                                                                                                        
INFO:Prepping for training...
INFO:Creating LoRA model...
INFO:Starting training...
{'loss': 12.5737, 'learning_rate': 0.0002926829268292683, 'epoch': 0.33}
{'loss': 8.5515, 'learning_rate': 0.0002560975609756097, 'epoch': 0.67}
{'loss': 7.5768, 'learning_rate': 0.0002195121951219512, 'epoch': 1.0}
{'loss': 6.9769, 'learning_rate': 0.00018292682926829266, 'epoch': 1.33}
{'loss': 6.6842, 'learning_rate': 0.00014634146341463414, 'epoch': 1.66}
{'loss': 6.3925, 'learning_rate': 0.0001097560975609756, 'epoch': 2.0}
{'loss': 6.041, 'learning_rate': 7.317073170731707e-05, 'epoch': 2.33}
{'loss': 5.6818, 'learning_rate': 3.6585365853658535e-05, 'epoch': 2.66}
{'loss': 5.4639, 'learning_rate': 0.0, 'epoch': 2.99}
{'train_runtime': 960.7748, 'train_samples_per_second': 6.005, 'train_steps_per_second': 0.047, 'train_loss': 7.326934729682074, 'epoch': 2.99}
INFO:LoRA training run is completed and saved.
INFO:Training complete!

As you see, training GPT-J doesn't look all that great. I think neither the official repo nor the pip-package PR has this enabled. https://github.com/oobabooga/text-generation-webui/pull/1333/files

Ph0rk0z avatar May 13 '23 12:05 Ph0rk0z

You are amazing... So, in conclusion, training GPT-J at 4 bits is not recommended for now?

ReDXeoL avatar May 13 '23 14:05 ReDXeoL

If you are handy with setting stuff up, you can try it: https://github.com/Ph0rk0z/text-generation-webui-testing

Every time I've used GPT-J at 4 bits, the scores are really not that great. The same loss for LLaMA is at about 1.9. Maybe it would be better to train for 6 epochs? Like you, I'm still rather new to training, but from what I've read it looks kind of meh.

Ph0rk0z avatar May 13 '23 15:05 Ph0rk0z

I downloaded text-generation-webui-testing and also updated the modules that you corrected, but for some reason I get this error when starting the webui. Do you know why? Has it ever happened to you? Isn't replacing the files the only thing I have to do? Sorry for the inconvenience, but I would appreciate your help.

Even though it exists, it cannot find the modules folder or its files. Sorry, I'm new, so I may be making some silly mistake without realizing it.

ReDXeoL avatar May 19 '23 03:05 ReDXeoL

Did you install GPTQ-Merged and all that fun stuff? And recompile it on Windows?

Ph0rk0z avatar May 19 '23 11:05 Ph0rk0z

You are a genius, thank you very much, it works now!!

ReDXeoL avatar May 19 '23 13:05 ReDXeoL

I wish I could be more help on Windows, but I honestly don't have a system with both Windows and a GPU that can run anything decent.

Ph0rk0z avatar May 19 '23 18:05 Ph0rk0z

It seems I still don't have complete luck. I managed to start it, but now I have a new error:

Traceback (most recent call last):
  File "A:\LLMs_LOCAL\oobabooga_windows\text-generation-webui\modules\training.py", line 330, in do_train
    lora_model = get_peft_model(shared.model, config)
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\peft\mapping.py", line 120, in get_peft_model
    return MODEL_TYPE_TO_PEFT_MODEL_MAPPING[peft_config.task_type](model, peft_config)
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\peft\peft_model.py", line 670, in __init__
    super().__init__(model, peft_config, adapter_name)
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\peft\peft_model.py", line 99, in __init__
    self.base_model = PEFT_TYPE_TO_MODEL_MAPPING[peft_config.peft_type](
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\peft\tuners\lora.py", line 154, in __init__
    self.add_adapter(adapter_name, self.peft_config[adapter_name])
  File "A:\LLMs_LOCAL\oobabooga_windows\installer_files\env\lib\site-packages\peft\tuners\lora.py", line 161, in add_adapter
    self._find_and_replace(adapter_name)
  File "A:\LLMs_LOCAL\oobabooga_windows\text-generation-webui\repositories\GPTQ-Merged\src\alpaca_lora_4bit\monkeypatch\peft_tuners_lora_monkey_patch.py", line 158, in _find_and_replace
    raise ValueError(
ValueError: Target module QuantLinear() is not supported. Currently, only torch.nn.Linear and Conv1D are supported.

These lines were commented out, and that's why it didn't work for me; it kept asking me for the monkeypatch:

# For GPTQ and autograd
sys.path.insert(0, str(Path("repositories/GPTQ-Merged/src/alpaca_lora_4bit")))
sys.path.insert(0, str(Path("repositories/GPTQ-Merged/src/gptq_llama")))
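
(For reference, the self-contained form of those lines, with the imports they rely on; exact placement in the webui source is assumed here:)

import sys
from pathlib import Path

# For GPTQ and autograd: make the GPTQ-Merged sources importable
sys.path.insert(0, str(Path("repositories/GPTQ-Merged/src/alpaca_lora_4bit")))
sys.path.insert(0, str(Path("repositories/GPTQ-Merged/src/gptq_llama")))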

ReDXeoL avatar May 19 '23 18:05 ReDXeoL

You have to save a clone of GPTQ-Merged under repositories; that's how they are imported. So on Linux, clone and install those under repositories. They are set up as submodules, so in theory you can run git submodule update --recursive --remote

And in case something has changed with PEFT, this is the version in requirements.txt for the 4-bit LoRA repo (I have not updated it in a little while): git+https://github.com/huggingface/peft.git@70af02a2bca5a63921790036b2c9430edf4037e2

Ph0rk0z avatar May 19 '23 19:05 Ph0rk0z

It was the first time I did it, so I didn't know how to install it, but thanks to you and GPT-4 I finally managed to install it correctly; sorry for my inexperience. On Windows 10 it goes like this:

git clone https://github.com/Ph0rk0z/GPTQ-Merged.git
cd GPTQ-Merged
git checkout dual-model
git submodule update --init --recursive

ReDXeoL avatar May 19 '23 20:05 ReDXeoL

That's good that it's working.

Ph0rk0z avatar May 19 '23 21:05 Ph0rk0z

I am running 8-bit and have the same error. A monkeypatch is not required for 8-bit, am I right?

Please advise, thank you. Steve

thusinh1969 avatar May 26 '23 06:05 thusinh1969

Yeah, 8-bit doesn't need the monkeypatch, but the model must be a full model, not a GPTQ one.
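
Something like this, as an untested sketch (example model name; assumes bitsandbytes is installed and a transformers version of that era, where load_in_8bit was a from_pretrained kwarg):

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/gpt-j-6b"  # example: a full-precision checkpoint, not a GPTQ one

# load_in_8bit quantizes on the fly via bitsandbytes; no GPTQ monkeypatch involved.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    load_in_8bit=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)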

Ph0rk0z avatar May 26 '23 11:05 Ph0rk0z

This issue has been closed due to inactivity for 30 days. If you believe it is still relevant, please leave a comment below.

github-actions[bot] avatar Aug 13 '23 23:08 github-actions[bot]