Alex "mcmonkey" Goodwin
`Tried to load multiple loras and won't. Even on latest hijacked PEFT.` I mean the main thing on that is just getting the patched hacks pushed upstream so that we're...
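For reference, here's roughly what multi-LoRA loading looks like against upstream PEFT once the patches land; the paths and adapter names below are made up for illustration, and the webui's patched copy differs:

```python
# Rough sketch of stacking adapters with stock PEFT; adapter names/paths are hypothetical.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/base-model")  # example base model

# Load the first LoRA under a name, then attach a second one to the same model.
model = PeftModel.from_pretrained(base, "loras/first-lora", adapter_name="first")
model.load_adapter("loras/second-lora", adapter_name="second")

# Only one adapter is active at a time with this API; switch between them.
model.set_adapter("second")
```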
`target_modules` mostly just identifies where the LoRA connects, since it's unique to each model type. I don't _think_ there's any reason to muck with that (unless there's room...
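For context, here's roughly where `target_modules` sits in a PEFT config; the `q_proj`/`v_proj` names are the usual attention projections for LLaMA-style models, and other architectures use different module names:

```python
# Sketch of where target_modules fits in a LoRA config; values are typical, not prescriptive.
from peft import LoraConfig

config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "v_proj"],  # which layers the LoRA attaches to
)
```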
Full model reload wasn't needed for 8-bit, but might be for 4-bit, or at least with the patches? If your version works better, you should PR the improvements.
@USBhost `For warm up steps that should only apply to constant with warm up scheduler. Iirc.`  All schedulers other than `constant` support a warmup. (inverse_sqrt does too even though...
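Rough illustration of the point with plain `transformers` `TrainingArguments` (the numbers are placeholders): warmup is its own knob and works with any scheduler name, not just `constant_with_warmup`:

```python
# Warmup applies regardless of which scheduler is selected; values here are placeholders.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="lora-out",
    lr_scheduler_type="cosine",   # any supported scheduler name works here
    warmup_steps=100,             # warmup steps are honored either way
    learning_rate=3e-4,
    num_train_epochs=3,
)
```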
All above requests (unless I missed one) are now done, and updated in the OP. Open questions I had previously are addressed and handled now. Saving will happen in the...
Added an option to select the optimizer; `adamw_bnb_8bit` doesn't seem to actually reduce VRAM over the default `adamw_hf`. Might be valuable to do some research/testing to see if maybe one...
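For anyone testing this: the optimizer choice is just passed through as the Trainer-level `optim` string, roughly like this (placeholder values):

```python
# The optimizer is selected by name via the HF Trainer's optim argument.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="lora-out",
    optim="adamw_bnb_8bit",  # or "adamw_hf", "adamw_torch", etc.
)
```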
Oh, that's a great idea!
Could you rebase/merge against the new main? There have been changes to the same sections of code which prevent it from merging. Also, perhaps add a comment in the docs somewhere...
The LLaMA-Precise preset is near-deterministic (different seeds rarely yield different outcomes), so make sure to test with a different preset if you think this change is breaking seeding behavior.
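If you want to sanity-check seeding outside the webui, something like this rough transformers-only sketch works; the sampling settings are stand-ins, not the exact LLaMA-Precise values:

```python
# Hedged sketch for checking whether different seeds actually change sampled output.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("path/to/model")
model = AutoModelForCausalLM.from_pretrained("path/to/model")
inputs = tok("The quick brown fox", return_tensors="pt")

outputs = []
for seed in (1, 2, 3):
    torch.manual_seed(seed)
    out = model.generate(**inputs, do_sample=True, max_new_tokens=30,
                         temperature=0.7, top_p=0.9)
    outputs.append(tok.decode(out[0], skip_special_tokens=True))

# With a very low top_p (as in a "precise" preset), these can all come out
# identical even though the seeds differ.
print(len(set(outputs)), "distinct completions out of", len(outputs))
```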
You can use either JSON or simple text files. There's a training tab in the webui where all the inputs are explained. If you use JSON datasets, you need to...
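As a rough example (the key names here are just the common alpaca-style layout, not a requirement), a JSON dataset is simply a list of records like this:

```python
# Minimal sketch of writing a JSON dataset; key names are an example layout only.
import json

dataset = [
    {
        "instruction": "Summarize the following text.",
        "input": "LoRA trains small rank-decomposition matrices instead of the full model.",
        "output": "LoRA fine-tunes a model by training small added matrices, not all weights.",
    },
]

with open("my-dataset.json", "w") as f:
    json.dump(dataset, f, indent=2)
```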