Tuan Pham

8 comments by Tuan Pham

Seems to be related to bitsandbytes; turning off `load_in_4bit` or `load_in_8bit` seems to make it work correctly.
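For reference, a minimal sketch of loading without the quantized path, assuming a standard Hugging Face transformers setup (the model id here is a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder checkpoint id; substitute the model you are actually loading.
model_id = "your/model-id"

tokenizer = AutoTokenizer.from_pretrained(model_id)

# Load without load_in_4bit / load_in_8bit so bitsandbytes is not involved.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    # load_in_4bit=True,  # leave these off to bypass the bitsandbytes path
    # load_in_8bit=True,
)
```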

> I found the related bug #1909 and the solution there: [#1909 (comment)](https://github.com/microsoft/DeepSpeed/issues/1909#issuecomment-1225113348)
>
> Basically:
>
> ```
> rm deepspeed/ops/{csrc,op_builder}
> rm deepspeed/accelerator
> cp -R csrc op_builder...
> ```

So does that mean that if I want to eval every epoch, I would have to merge the LoRA adapter and then run `model.generate` at every epoch?
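For context, a minimal sketch of the merge-then-generate flow described in the question, assuming a PEFT LoRA setup (the checkpoint paths and generation arguments are placeholders):

```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "your/base-model-id"          # placeholder base checkpoint
adapter_dir = "path/to/lora_adapter"    # placeholder adapter saved for this epoch

tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id)

# Attach the LoRA adapter and merge its weights into the base model.
model = PeftModel.from_pretrained(base, adapter_dir).merge_and_unload()

inputs = tokenizer("Evaluation prompt goes here", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```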

> > So does that mean that if I want to eval every epoch, I would have to merge the LoRA adapter and then run `model.generate` at every epoch?...

Friendly cc @casperdcl! I tried to hotfix by force-installing `4.66.1`, and it worked for another 4 months before the error appeared again.

```
File /opt/conda/lib/python3.10/site-packages/tqdm/notebook.py:156, in tqdm_notebook.display(self, msg, pos, close, bar_style, check_delay)...
```

Hey @danielhanchen, just to let you know the bug is gone in `transformers==4.41.2`. Might help narrow down the bug, as I saw a push relating to caching in [4.42.1](https://github.com/huggingface/transformers/releases/tag/v4.42.1).

@thusinh1969 It should be; you can lower the alpha of the instruction adapter if you find it to be too overpowering. Or just target the attention layers and leave out...
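As a rough illustration of the two suggestions above (the values here are assumptions, not recommendations), a PEFT LoRA config with a lowered alpha that targets only the attention projections might look like:

```python
from peft import LoraConfig

# Illustrative config: a smaller lora_alpha reduces the adapter's contribution,
# and target_modules restricted to the attention projections leaves out the MLP layers.
lora_config = LoraConfig(
    r=16,
    lora_alpha=8,  # lower alpha -> weaker adapter influence
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # attention only
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
```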

Hi @mfazrinizar, just FYI, I might not be able to review this until this weekend. Very much appreciate the contribution!