Daniel Han
Oh set `do_sample = False`
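A minimal sketch with a standard Hugging Face `generate` call, assuming `model` and `tokenizer` are already loaded (names here are placeholders) - greedy decoding removes the sampling randomness:

```python
# Greedy decoding: no sampling, so repeated runs give the same output.
inputs = tokenizer("Hello!", return_tensors = "pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens = 64, do_sample = False)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))
```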
Yep the other option is to forcibly turn all random stuff off - ie https://pytorch.org/docs/stable/notes/randomness.html
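A rough sketch following the PyTorch reproducibility notes linked above - seed everything and force deterministic algorithms where possible (the seed value is arbitrary):

```python
import random
import numpy as np
import torch

# Seed all RNGs involved (Python, NumPy, PyTorch CPU and CUDA).
random.seed(3407)
np.random.seed(3407)
torch.manual_seed(3407)
torch.cuda.manual_seed_all(3407)

# Prefer deterministic kernels; warn_only avoids hard errors for ops
# that have no deterministic implementation.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.use_deterministic_algorithms(True, warn_only = True)
```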
Hmm ok so an EOS token is missing - I'll check this - thanks for the report
Yep working on a fix!
Apologies - hopefully it's fixed now!
There is a way to overwrite the code itself and allow input_embeds to be passed, but it'll be a bit of custom code - another way is to save the...
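As a rough illustration of the general idea only (not Unsloth's actual internals; `model` and `tokenizer` are assumed to be already loaded), standard HF decoder models accept `inputs_embeds` in their forward pass:

```python
import torch

# Build embeddings manually from the model's embedding layer,
# then pass them in place of input_ids.
input_ids = tokenizer("Hello!", return_tensors = "pt").input_ids.to(model.device)
inputs_embeds = model.get_input_embeddings()(input_ids)  # (batch, seq_len, hidden)

with torch.no_grad():
    out = model(inputs_embeds = inputs_embeds)
print(out.logits.shape)
```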
I'll see what I can do, but it'll be a bit tough to edit HF directly :(
Ok this is a weird one! Hmmm I'm assuming the quantization is not using double quantization nf4 but just a single quant?
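For reference, the distinction being asked about, sketched with the `transformers` `BitsAndBytesConfig` (parameter names are from the transformers/bitsandbytes API; the values are illustrative):

```python
import torch
from transformers import BitsAndBytesConfig

# nf4 with double quantization (quantizes the quantization constants too).
double_quant = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type = "nf4",
    bnb_4bit_use_double_quant = True,
    bnb_4bit_compute_dtype = torch.bfloat16,
)

# nf4 with a single quantization pass only.
single_quant = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type = "nf4",
    bnb_4bit_use_double_quant = False,
    bnb_4bit_compute_dtype = torch.bfloat16,
)
```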
Hmm unsure on the bad outputs - did you use `FastLanguageModel.for_inference(model)`?
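A minimal sketch of the intended flow (the model name and settings here are placeholder assumptions):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # switch the model to Unsloth's faster inference mode

inputs = tokenizer("Hello!", return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))
```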
Oh we use BnB directly