Daniel Han
@its5Q That's very weird :( For me it seems to work perfectly. I have an example if you can run this:

```python
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(...
```
Also @its5Q you need to use `padding_side = "left"`, or else the results will be wrong
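For context on why `padding_side = "left"` matters for batched generation, here is a minimal pure-Python sketch (not Unsloth/transformers code; the pad id of 0 is an assumption): generation appends tokens after the last position, so with right padding a short sequence ends in pad tokens and the model continues from a pad instead of the real last token.

```python
PAD = 0  # hypothetical pad token id, for illustration only

def pad_batch(seqs, side="left"):
    """Pad variable-length token-id lists to equal length on one side."""
    width = max(len(s) for s in seqs)
    out = []
    for s in seqs:
        pads = [PAD] * (width - len(s))
        out.append(pads + s if side == "left" else s + pads)
    return out

batch = [[5, 6], [7, 8, 9]]

left = pad_batch(batch, side="left")    # [[0, 5, 6], [7, 8, 9]]
right = pad_batch(batch, side="right")  # [[5, 6, 0], [7, 8, 9]]

# Generation appends after the last column: with left padding the last
# column holds real tokens; with right padding the short sequence ends
# in PAD, so the model would continue from a pad token.
print([row[-1] for row in left])   # [6, 9] -> real tokens
print([row[-1] for row in right])  # [0, 9] -> pad leaks into generation
```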
@its5Q I'm thinking about whether I can somehow default it to left, since people have said this is an ongoing issue!
@JIBSIL Oh, if you select `do_sample = False` there is no randomness involved. On the `left` issue - the complication is for training: defaulting to left padding makes training more complex, and Unsloth...
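To illustrate the `do_sample = False` point: with sampling off, decoding is greedy - it just takes the argmax at each step, so two runs over the same scores always agree. A toy sketch (the per-step score lists stand in for model logits and are made up, not a real LM):

```python
def greedy_decode(step_logits):
    """Pick the highest-scoring token id at each step (do_sample=False)."""
    return [max(range(len(logits)), key=lambda i: logits[i])
            for logits in step_logits]

# Toy per-step vocabulary scores standing in for model logits.
logits = [[0.1, 2.0, 0.3],
          [1.5, 0.2, 0.9],
          [0.4, 0.4, 3.1]]

# Greedy decoding is deterministic: identical output on every call.
print(greedy_decode(logits))                           # [1, 0, 2]
print(greedy_decode(logits) == greedy_decode(logits))  # True
```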
@its5Q Whoops, you're correct! I decided to just run the notebook - it's 100% finally fixed now, oh lord, so sorry!!! The perils of supporting multiple models :(
@armsp LoRA and QLoRA for reward models, PPO, DPO, etc. are all supported - i.e. anything TRL does, we can do :) But it just needs to be LoRA /...
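Since DPO came up: for reference, a minimal pure-Python sketch of the DPO objective that TRL's `DPOTrainer` optimizes, for a single preference pair. The function and parameter names here are illustrative, not TRL's actual API; inputs are sequence log-probs under the policy and the frozen reference model.

```python
import math

def dpo_loss(pi_chosen, pi_rejected, ref_chosen, ref_rejected, beta=0.1):
    """DPO loss for one preference pair:
    -log sigmoid(beta * ((pi_w - ref_w) - (pi_l - ref_l)))
    where inputs are sequence log-probabilities."""
    chosen_ratio = pi_chosen - ref_chosen      # log pi(y_w|x) - log ref(y_w|x)
    rejected_ratio = pi_rejected - ref_rejected
    margin = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))  # -log(sigmoid(margin))

# Illustrative log-probs: the policy prefers the chosen answer more
# than the reference does, so the loss is below log(2) (~0.693).
loss = dpo_loss(pi_chosen=-10.0, pi_rejected=-15.0,
                ref_chosen=-12.0, ref_rejected=-11.0)
print(round(loss, 4))  # 0.4375
```

The loss shrinks as the policy's preference margin over the reference grows, which is what pushes the model toward the chosen completions.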
@armsp Sadly I don't - I have DPO, but for the rest you'll have to read the TRL docs
Fantastic!
@armsp Oh no :( I'll check again and get back to you - sorry on the issue!
Extreme apologies - I've been extremely busy on my end, so apologies again that I didn't have time to look at this :(