Daniel Han

983 comments by Daniel Han

I'm working on a new method which will make this better!

@brando90 The issue is that upcasting and then merging to 16-bit can sometimes cause accuracy issues

Oh yes!! Actually, would you be interested in opening a PR to edit the README file? Then I'll copy-paste your edits to the Wiki :)

TRL's data collator only works on a single conversation, not on multiple conversations.

Oh it's because the base model has untrained tokens - see https://unsloth.ai/blog/phi3 (Phi-3 blog has Llama-3 fixes). We identified this issue about using the Llama-3 chat template for the base...

Diffusion / Llava type models are next on our roadmap!

@al-swaiti Yep working on them!

Oh did you add new tokens?

Yes! Simply change the model name - we'll error out if the model does not work

OK, that's weird, it's not merging correctly? I'll check