Daniel Han
@Erland366 Could you check if vLLM works still if no LoRA adapters are added? I think you also had a PR on moving `load_lora` outside of `get_peft_model`
@wdlctc Thanks a lot again!! I'll test it and verify all losses match! Appreciate it!
Sorry on the delay - was planning to add this together with Vision support :) It might take a few more days!
Oh lol I noticed I accidentally deleted this PR after I deleted the nightly branch - whoops so sorry!
Interesting - so I looked through the paper and code. Essentially you're proposing to do gradient accumulation inside of each sequence length? I.e. the first is normally chunking the CE...
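A minimal sketch of the idea being discussed - chunking the cross-entropy over the sequence and accumulating, like gradient accumulation within one sequence. This is an illustrative NumPy toy (the helper names and shapes are assumptions, not Unsloth's actual implementation); the key point is that re-weighting each chunk's mean loss by its token count recovers the full-sequence loss:

```python
import numpy as np

def cross_entropy(logits, labels):
    # Standard softmax cross-entropy, averaged over tokens.
    shifted = logits - logits.max(axis=-1, keepdims=True)
    logprobs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return -logprobs[np.arange(len(labels)), labels].mean()

def chunked_cross_entropy(logits, labels, chunk_size):
    # Accumulate per-chunk losses (weighted by chunk length) so the
    # final mean matches the unchunked loss exactly.
    total, n = 0.0, len(labels)
    for i in range(0, n, chunk_size):
        chunk_labels = labels[i:i + chunk_size]
        total += cross_entropy(logits[i:i + chunk_size], chunk_labels) * len(chunk_labels)
    return total / n

rng = np.random.default_rng(0)
logits = rng.normal(size=(10, 5))
labels = rng.integers(0, 5, size=10)
full = cross_entropy(logits, labels)
chunked = chunked_cross_entropy(logits, labels, chunk_size=3)
```

The weighting step matters: averaging the per-chunk means directly would bias the result whenever the last chunk is shorter than the others.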
@Erland366 Could you confirm if my latest fix allows multi-GPU to work OK? Thanks. I think it's CCE-related but unsure
Do you mean the logging that I provided, or do you want to use a reward model? GRPO in general doesn't use a reward model - it calculates advantages...
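To illustrate the advantage calculation mentioned above: GRPO normalizes rule-based rewards within each group of sampled completions rather than scoring with a learned reward model. A minimal sketch (the function name and epsilon are assumptions for illustration):

```python
import numpy as np

def grpo_advantages(rewards):
    # Group-relative advantages: normalize each completion's reward
    # against the mean and std of its own sampling group.
    rewards = np.asarray(rewards, dtype=float)
    # Epsilon guards against a zero-variance group (all rewards equal).
    return (rewards - rewards.mean()) / (rewards.std() + 1e-8)

# Four completions for one prompt, scored by a rule-based reward:
adv = grpo_advantages([1.0, 0.0, 0.5, 1.0])
```

Completions above the group mean get positive advantages, those below get negative ones, so no separate value or reward network is needed.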
Yes working on it!
Hopefully out soon.
@nottrz @DiTo97 @Summer142857 Apologies - just fixed! For Colab / Kaggle, please restart and run all. For local machines, please do:
```
pip install --force-reinstall --upgrade --no-cache-dir --no-deps unsloth unsloth_zoo...
```