Daniel Han
Oh set `do_sample = False`
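A minimal sketch with a standard Hugging Face `generate` call, assuming `model` and `tokenizer` are already loaded (names here are placeholders) - greedy decoding removes the sampling randomness:

```python
# Greedy decoding: no sampling, so repeated runs give the same output.
inputs = tokenizer("Hello!", return_tensors = "pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens = 64, do_sample = False)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))
```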
Yep the other option is to forcibly turn all random stuff off - ie https://pytorch.org/docs/stable/notes/randomness.html
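A rough sketch following the PyTorch reproducibility notes linked above - seed everything and force deterministic algorithms where possible (the seed value is arbitrary):

```python
import random
import numpy as np
import torch

# Seed all RNGs involved (Python, NumPy, PyTorch CPU and CUDA).
random.seed(3407)
np.random.seed(3407)
torch.manual_seed(3407)
torch.cuda.manual_seed_all(3407)

# Prefer deterministic kernels; warn_only avoids hard errors for ops
# that have no deterministic implementation.
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
torch.use_deterministic_algorithms(True, warn_only = True)
```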
Hmm ok so an EOS token is missing - I'll check this - thanks for the report
Yep working on a fix!
Apologies - hopefully it's fixed now!
There is a way to overwrite the code itself and allow input_embeds to be passed, but it'll be a bit of custom code - another way is to save the...
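As a rough illustration of the general idea only (not Unsloth's actual internals; `model` and `tokenizer` are assumed to be already loaded), standard HF decoder models accept `inputs_embeds` in their forward pass:

```python
import torch

# Build embeddings manually from the model's embedding layer,
# then pass them in place of input_ids.
input_ids = tokenizer("Hello!", return_tensors = "pt").input_ids.to(model.device)
inputs_embeds = model.get_input_embeddings()(input_ids)  # (batch, seq_len, hidden)

with torch.no_grad():
    out = model(inputs_embeds = inputs_embeds)
print(out.logits.shape)
```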
I'll see what I can do, but it'll be a bit tough to edit HF directly :(
Ok this is a weird one! Hmmm I'm assuming the quantization is not using double quantization nf4 but just a single quant?
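For reference, the distinction being asked about, sketched with the `transformers` `BitsAndBytesConfig` (parameter names are from the transformers/bitsandbytes API; the values are illustrative):

```python
import torch
from transformers import BitsAndBytesConfig

# nf4 with double quantization (quantizes the quantization constants too).
double_quant = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type = "nf4",
    bnb_4bit_use_double_quant = True,
    bnb_4bit_compute_dtype = torch.bfloat16,
)

# nf4 with a single quantization pass only.
single_quant = BitsAndBytesConfig(
    load_in_4bit = True,
    bnb_4bit_quant_type = "nf4",
    bnb_4bit_use_double_quant = False,
    bnb_4bit_compute_dtype = torch.bfloat16,
)
```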
Hmm unsure on the bad outputs - did you use `FastLanguageModel.for_inference(model)`?
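A minimal sketch of the intended flow (the model name and settings here are placeholder assumptions):

```python
from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",
    max_seq_length = 2048,
    load_in_4bit = True,
)
FastLanguageModel.for_inference(model)  # switch the model to Unsloth's faster inference mode

inputs = tokenizer("Hello!", return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64)
print(tokenizer.decode(outputs[0], skip_special_tokens = True))
```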
Oh we use BnB directly