JD

Results: 15 comments of JD

> Hey! Did you guys figure out a solution to this problem? Thanks! Unfortunately not yet, I spent a lot of time trying to figure out a way to do...

I had this issue when I tried to use the model through Hugging Face like this: it gave me this long warning: `Some weights of the model checkpoint at voidism/diffcse-roberta-base-sts...
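For anyone else hitting the same message, here is a minimal sketch of how I load it with plain transformers (the exact checkpoint name and the pooling choice are my assumptions); as far as I understand, the "Some weights of the model checkpoint ... were not used" warning is expected when `AutoModel` drops extra heads stored in the checkpoint that the encoder-only class does not use:

```python
# Minimal sketch, assuming the checkpoint is voidism/diffcse-roberta-base-sts.
# The "weights ... were not used" warning is generally harmless here: AutoModel
# loads only the encoder and ignores any extra heads saved in the checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

name = "voidism/diffcse-roberta-base-sts"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer(["A sentence to embed."], return_tensors="pt", padding=True)
with torch.no_grad():
    # Use the first-token ([CLS]) hidden state as the sentence embedding.
    emb = model(**inputs).last_hidden_state[:, 0]
print(emb.shape)
```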

Hi, were you able to use QLoRA with Flan-T5? I was trying to do it, but I got this error: ```python ValueError: Trying to set a tensor of shape torch.Size([4096,...
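For reference, this is roughly the QLoRA setup I am trying for Flan-T5 (the base model, LoRA hyperparameters, and target modules below are illustrative assumptions, not a confirmed recipe); shape-mismatch `ValueError`s like the one above may mean weights or adapters are being loaded for a differently sized base model:

```python
# A hedged QLoRA sketch for Flan-T5 with bitsandbytes 4-bit + peft.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "google/flan-t5-xl"  # assumed base checkpoint
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(
    base, quantization_config=bnb, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5 attention projection names
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```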

@danielhanchen When I set `load_in_8bit=True` in my code, ``` model, tokenizer = FastLanguageModel.from_pretrained( model_name = model_name, # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B load_in_8bit=True, load_in_4bit=False, ``` I encountered the following error: `...

Thank you for the response, Daniel. But I'm still confused: could you please explain why there is a difference between the padding side used during training ("right") versus inference ("left")?...
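For anyone reading along, here is a small sketch of what I understand the inference-side setup to look like (the model name is just a placeholder): with left padding, every prompt in the batch ends right where generation starts, whereas with right padding the pad tokens would sit between the prompt and the newly generated tokens.

```python
# Minimal sketch of batched generation with left padding for a decoder-only model.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-v0.1"  # assumed base model, placeholder only
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"  # inference: pad on the left

prompts = ["Hello, how are you?", "Write a haiku about GPUs."]
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
out = model.generate(**batch, max_new_tokens=32)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```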

I am experiencing similar behaviour: the training loss values show considerable fluctuations, as you can see below. Here is my code; is there something wrong with the training parameters that...
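For context, these are the kinds of training parameters I am asking about (the values below are illustrative, not my exact config); my understanding is that a larger effective batch via gradient accumulation and a gentler, warmed-up learning rate mainly change how noisy the logged loss looks.

```python
# Hedged sketch of training arguments that usually affect loss-curve noise.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # effective batch size 16 -> smoother loss
    learning_rate=2e-4,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    logging_steps=10,                # averaging over more steps also smooths the plot
    num_train_epochs=3,
)
```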

Thank you for the prompt response. Here is the code and output: ![Screenshot 2024-05-27 at 4 48 11 PM](https://github.com/huggingface/peft/assets/145554661/578d7270-43d5-45d8-8ece-d1ee9625982f) The LoRAs appear in the model when I print the model...
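For completeness, this is roughly how I am checking that the adapters are really attached (assuming `model` is a peft `PeftModel` wrapping the base causal LM):

```python
# Minimal sketch of inspecting attached LoRA adapters on a PeftModel.
from peft import PeftModel

def inspect_adapters(model: PeftModel) -> None:
    # The wrapped module tree shows lora_A / lora_B layers when printed.
    print(model)
    # Registered adapter configs and the one currently active.
    print("adapters:", list(model.peft_config.keys()))
    print("active adapter:", model.active_adapter)
    # Only the LoRA weights should be trainable.
    model.print_trainable_parameters()
```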

No, I was not aware of that ("adding a fresh, untrained LoRA adapter"), so I have now changed the code based on this information, but I got the same behaviour as you...
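To make sure I understood the point about a fresh, untrained LoRA adapter, this is the kind of sanity check I ran (using a small placeholder model, not my actual setup): peft initialises `lora_B` to zeros, so the untrained adapter's update is zero and the outputs should match the base model exactly.

```python
# Sketch: a freshly attached, untrained LoRA adapter should not change outputs,
# because the adapter delta B @ A is zero until training updates lora_B.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

name = "gpt2"  # small assumed model, just for the check
tokenizer = AutoTokenizer.from_pretrained(name)
base = AutoModelForCausalLM.from_pretrained(name)

ids = tokenizer("Hello, how are you?", return_tensors="pt")
with torch.no_grad():
    base_logits = base(**ids).logits.clone()

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2 uses Conv1D attention projections
    fan_in_fan_out=True,
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base, config)
with torch.no_grad():
    lora_logits = peft_model(**ids).logits

print(torch.allclose(base_logits, lora_logits, atol=1e-5))  # expected: True
```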

Thank you for your prompt responses. I tried what you mentioned; here is what I got, the same behaviour: ``` Base Model Output: Hello, how are you? I hope you’re having...

> > If I'm not mistaken the reason for the prefix is because most models don't interpret the initial token correctly, so this was used to pad it. The value...