JD

Results: 15 comments of JD

> Hey! Did you guys figure out a solution to this problem? Thanks! Unfortunately not yet, I spent a lot of time trying to figure out a way to do...

I had this issue when I tried to use the model through Hugging Face like this: it gave me this long warning: `Some weights of the model checkpoint at voidism/diffcse-roberta-base-sts...
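For anyone else hitting the same message, here is a minimal sketch of how I load it with plain transformers (the exact checkpoint name and the pooling choice are my assumptions); as far as I understand, the "Some weights of the model checkpoint ... were not used" warning is expected when `AutoModel` drops extra heads stored in the checkpoint that the encoder-only class does not use:

```python
# Minimal sketch, assuming the checkpoint is voidism/diffcse-roberta-base-sts.
# The "weights ... were not used" warning is generally harmless here: AutoModel
# loads only the encoder and ignores any extra heads saved in the checkpoint.
import torch
from transformers import AutoModel, AutoTokenizer

name = "voidism/diffcse-roberta-base-sts"  # assumed checkpoint name
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModel.from_pretrained(name)

inputs = tokenizer(["A sentence to embed."], return_tensors="pt", padding=True)
with torch.no_grad():
    # Use the first-token ([CLS]) hidden state as the sentence embedding.
    emb = model(**inputs).last_hidden_state[:, 0]
print(emb.shape)
```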

Hi, were you able to use QLoRA with Flan-T5? I was trying to do it, but I got this error: ```python ValueError: Trying to set a tensor of shape torch.Size([4096,...
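For reference, this is roughly the QLoRA setup I am trying for Flan-T5 (the base model, LoRA hyperparameters, and target modules below are illustrative assumptions, not a confirmed recipe); shape-mismatch `ValueError`s like the one above may mean weights or adapters are being loaded for a differently sized base model:

```python
# A hedged QLoRA sketch for Flan-T5 with bitsandbytes 4-bit + peft.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base = "google/flan-t5-xl"  # assumed base checkpoint
bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSeq2SeqLM.from_pretrained(
    base, quantization_config=bnb, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

lora = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q", "v"],  # T5 attention projection names
    task_type="SEQ_2_SEQ_LM",
)
model = get_peft_model(model, lora)
model.print_trainable_parameters()
```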

@danielhanchen When I set `load_in_8bit=True` in my code, ``` model, tokenizer = FastLanguageModel.from_pretrained( model_name = model_name, # Choose ANY! eg teknium/OpenHermes-2.5-Mistral-7B load_in_8bit=True, load_in_4bit=False, ``` I encountered the following error: `...

Thank you for the response, Daniel. But I'm still confused: could you please explain why there is a difference between the padding side used during training ("right") versus inference ("left")?...
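For anyone reading along, here is a small sketch of what I understand the inference-side setup to look like (the model name is just a placeholder): with left padding, every prompt in the batch ends right where generation starts, whereas with right padding the pad tokens would sit between the prompt and the newly generated tokens.

```python
# Minimal sketch of batched generation with left padding for a decoder-only model.
from transformers import AutoModelForCausalLM, AutoTokenizer

name = "mistralai/Mistral-7B-v0.1"  # assumed base model, placeholder only
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForCausalLM.from_pretrained(name, device_map="auto")

tokenizer.pad_token = tokenizer.eos_token
tokenizer.padding_side = "left"  # inference: pad on the left

prompts = ["Hello, how are you?", "Write a haiku about GPUs."]
batch = tokenizer(prompts, return_tensors="pt", padding=True).to(model.device)
out = model.generate(**batch, max_new_tokens=32)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```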

I am experiencing similar behaviour: the training loss values show considerable fluctuations, as you can see below. Here is my code; is there something wrong with the training parameters that...
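For context, these are the kinds of training parameters I am asking about (the values below are illustrative, not my exact config); my understanding is that a larger effective batch via gradient accumulation and a gentler, warmed-up learning rate mainly change how noisy the logged loss looks.

```python
# Hedged sketch of training arguments that usually affect loss-curve noise.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=2,
    gradient_accumulation_steps=8,   # effective batch size 16 -> smoother loss
    learning_rate=2e-4,
    warmup_ratio=0.03,
    lr_scheduler_type="cosine",
    logging_steps=10,                # averaging over more steps also smooths the plot
    num_train_epochs=3,
)
```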

Thank you for the prompt response. Here is the code and output: ![Screenshot 2024-05-27 at 4 48 11 PM](https://github.com/huggingface/peft/assets/145554661/578d7270-43d5-45d8-8ece-d1ee9625982f) The LoRAs appear in the model when I print the model...
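For completeness, this is roughly how I am checking that the adapters are really attached (assuming `model` is a peft `PeftModel` wrapping the base causal LM):

```python
# Minimal sketch of inspecting attached LoRA adapters on a PeftModel.
from peft import PeftModel

def inspect_adapters(model: PeftModel) -> None:
    # The wrapped module tree shows lora_A / lora_B layers when printed.
    print(model)
    # Registered adapter configs and the one currently active.
    print("adapters:", list(model.peft_config.keys()))
    print("active adapter:", model.active_adapter)
    # Only the LoRA weights should be trainable.
    model.print_trainable_parameters()
```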

No, I was not aware of that ("adding a fresh, untrained LoRA adapter"), so I have now changed the code based on this information, but I got the same behaviour as you...
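To make sure I understood the point about a fresh, untrained LoRA adapter, this is the kind of sanity check I ran (using a small placeholder model, not my actual setup): peft initialises `lora_B` to zeros, so the untrained adapter's update is zero and the outputs should match the base model exactly.

```python
# Sketch: a freshly attached, untrained LoRA adapter should not change outputs,
# because the adapter delta B @ A is zero until training updates lora_B.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

name = "gpt2"  # small assumed model, just for the check
tokenizer = AutoTokenizer.from_pretrained(name)
base = AutoModelForCausalLM.from_pretrained(name)

ids = tokenizer("Hello, how are you?", return_tensors="pt")
with torch.no_grad():
    base_logits = base(**ids).logits.clone()

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],  # GPT-2 uses Conv1D attention projections
    fan_in_fan_out=True,
    task_type="CAUSAL_LM",
)
peft_model = get_peft_model(base, config)
with torch.no_grad():
    lora_logits = peft_model(**ids).logits

print(torch.allclose(base_logits, lora_logits, atol=1e-5))  # expected: True
```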

Thank you for your prompt responses. I tried what you mentioned; here is what I got, the same behaviour: ``` Base Model Output: Hello, how are you? I hope you’re having...

> > If I'm not mistaken the reason for the prefix is because most models don't interpret the initial token correctly, so this was used to pad it. The value...