Benjamin Bossan
@echo-yi Does it work for you with a smaller model, like the example from the PEFT docs? @matthewdouglas @Titus-von-Koeller Could you please take a look? Could it be an issue...
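For context, a minimal sketch of what such a smaller-model test could look like, loosely following the style of the PEFT quicktour; the model name and LoRA settings below are placeholders rather than the exact example from the docs:

```python
# Hypothetical smaller-model sanity check (placeholder model and LoRA settings,
# not the exact snippet from the PEFT docs).
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # placeholder small model
lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.1,
)
model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()
```

If a small model like this loads and trains fine, that would point at the specific base model or its loading configuration rather than at PEFT itself.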
Thanks for testing those. Since this error occurs already at the stage of loading the base model, it is not directly a PEFT error, though of course PEFT is affected...
> shared [this line](https://github.com/huggingface/transformers/blob/2a5a6ad18aa22e98429bb5ecb880660328030ea0/src/transformers/modeling_utils.py#L3796-L3800), indicating applying both quantization and DS ZeRO3 doesn't work

Yeah, that was added in the PR I mentioned earlier. I can confirm that even for smaller...
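For illustration, here is a rough sketch of the combination that the linked check rejects, assuming a bitsandbytes 4-bit config and a placeholder model name (this is not the script from the issue):

```python
# Sketch only: loading a bitsandbytes-quantized base model while DeepSpeed
# ZeRO-3 is active is the combination rejected by the linked transformers check.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# When this script is launched with a DeepSpeed ZeRO-3 config (e.g. via
# `accelerate launch`), transformers raises the error here, during base model
# loading and before any PEFT code runs.
model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder model
    quantization_config=bnb_config,
)
```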
Also pinging @muellerzr in case he knows something about this.
Thanks for reporting the error, but I have trouble understanding what you suggest. When you write

> Let's comment out the following two lines in def `__main__`

do you mean...
So if I understand you correctly, for your use case, you don't want to use `HfArgumentParser` because all parameters are fixed. However, when you comment it out, you get an...
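If the intent is really to drop the command-line parsing, a sketch of the alternative might look like the following; `ScriptArguments` and its fields are placeholders for whatever the script actually defines:

```python
# Hypothetical sketch: replace HfArgumentParser with direct instantiation when
# all parameters are fixed. `ScriptArguments` is a placeholder dataclass.
from dataclasses import dataclass

from transformers import HfArgumentParser, TrainingArguments


@dataclass
class ScriptArguments:
    model_name: str = "facebook/opt-350m"  # placeholder
    learning_rate: float = 2e-4


# Original approach: parse values from sys.argv.
# parser = HfArgumentParser((ScriptArguments, TrainingArguments))
# script_args, training_args = parser.parse_args_into_dataclasses()

# With fixed parameters, the dataclasses can simply be instantiated directly.
script_args = ScriptArguments()
training_args = TrainingArguments(
    output_dir="outputs",
    learning_rate=script_args.learning_rate,
)
```

`HfArgumentParser` is only a convenience layer for turning `sys.argv` into these dataclasses, so instantiating them directly is equivalent as long as every required field gets a value.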
Great, please report back when you have results.
I'm honestly at a loss here. This appears to be some implementation issue deep down in the PyTorch MPS code. As it says, this error should be reported to PyTorch....
Could you please provide more information (preferably in English) about whether you used quantization on the base model (e.g. bitsandbytes 4-bit)?
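For reference, a sketch of what 4-bit quantization of the base model typically looks like in a PEFT fine-tuning setup (model name and LoRA settings are placeholders):

```python
# Sketch of a QLoRA-style setup: the base model is quantized with bitsandbytes
# 4-bit before the PEFT adapter is attached. Names are placeholders.
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base_model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-350m",  # placeholder model
    quantization_config=bnb_config,
)
base_model = prepare_model_for_kbit_training(base_model)
model = get_peft_model(base_model, LoraConfig(task_type="CAUSAL_LM", r=8))
```

Knowing whether your setup included something like this helps narrow down whether the problem sits in PEFT, bitsandbytes, or the base model loading.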
Thanks for the info. I could reproduce a similar result to what you found. Here is a standalone reproducer:

```python
import os
import sys
import torch
from transformers import AutoTokenizer, ...
```