Kartik Khare

Results 36 comments of Kartik Khare

Yes, @taishan1994 is correct: we can't use an older version of peft since it doesn't have the qlora changes.

No, it's a bug. See https://github.com/artidoro/qlora/pull/44

If you have PyTorch `.bin` weight files in the checkpoint dir then it won't, but otherwise it might.
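A minimal sketch of that check, assuming the helper name `has_full_weights` (not from the repo, just illustrative): it only tests whether any `.bin` weight files are present in the checkpoint directory.

```python
import glob
import os

def has_full_weights(checkpoint_dir: str) -> bool:
    """Return True if the checkpoint dir already contains PyTorch .bin
    weight files (hypothetical helper illustrating the check above)."""
    return bool(glob.glob(os.path.join(checkpoint_dir, "*.bin")))
```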

No, I meant it should still be somewhat coherent and not filled with gibberish tokens.

> @KKcorps this happened when my learning rate was too high. I was able to solve this using a different base model. But if the learning rate is too high then...

I should honestly refactor the variable names, but for now it works. `resume_from_checkpoint` is probably not needed; we just need to check if `checkpoint_dir` is used someplace else or we can...

@artidoro you are correct, it doesn't reload the optimizer and scheduler states. Checking if Hugging Face allows that separately; otherwise we might have to just load the full checkpoint. For...

You can load both adapter files using `torch.load`, print all the named parameters and take a diff to see what params are missing.
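The diff described above can be sketched like this; the function name and file paths are illustrative, not from the repo.

```python
import torch

def diff_state_dicts(path_a, path_b):
    """Load two adapter checkpoint files and report which named
    parameters exist in one but not the other."""
    sd_a = torch.load(path_a, map_location="cpu")
    sd_b = torch.load(path_b, map_location="cpu")
    only_in_a = sorted(set(sd_a) - set(sd_b))
    only_in_b = sorted(set(sd_b) - set(sd_a))
    return only_in_a, only_in_b
```

Printing both lists side by side makes it obvious which adapter weights failed to save.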

@Jackie-Jiang went with the first alternative