Daniel Han

781 comments of Daniel Han

@shan23chen Yes, both LoRA and QLoRA.

@VishnuPJ Sorry on the delay! Yes, continued pretraining as mentioned by @erwe324 is a better solution - i.e. take some Wikipedia data in the specific...
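
A rough sketch of what that could look like, assuming you pull raw article text from the Hugging Face `wikimedia/wikipedia` dump (the dataset config below is a placeholder, not a recommendation):

```python
from datasets import load_dataset

# Illustrative only: grab a Wikipedia dump in the target language
# (the config name below is a placeholder) and use the raw article
# text for continued pretraining.
dataset = load_dataset("wikimedia/wikipedia", "20231101.en", split = "train")

# Keep just the article body - Unsloth's trainers read a plain "text" column.
dataset = dataset.remove_columns(
    [col for col in dataset.column_names if col != "text"]
)
```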

@VishnuPJ Oh so use
```python
from unsloth import add_new_tokens

# Add the new tokens before attaching the LoRA adapters
add_new_tokens(model, tokenizer, new_tokens = ["NEW_TOKEN", "NEW_TOKEN_2"])
# Then add get_peft_model
```
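
For the `get_peft_model` step, a rough sketch - the target module list is the usual set plus `embed_tokens` / `lm_head` so the newly added token rows actually get trained (exact settings are up to you):

```python
from unsloth import FastLanguageModel

# Sketch only: include embed_tokens / lm_head so the new token embeddings
# are trainable, alongside the usual LoRA projection matrices.
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj",
                      "embed_tokens", "lm_head"],
    lora_alpha = 16,
    use_gradient_checkpointing = "unsloth",
)
```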

@liwd190019 `UnslothTrainer` can allow you to set 2 learning rates - one for the lm_head / embed_tokens, and another for the LoRA adapters - we talk about that here: https://unsloth.ai/blog/contpretraining
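
A rough sketch of that setup (the values are illustrative, not recommendations - see the blog post for the suggested ratio between the two rates):

```python
from unsloth import UnslothTrainer, UnslothTrainingArguments

trainer = UnslothTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = 2048,
    args = UnslothTrainingArguments(
        per_device_train_batch_size = 2,
        gradient_accumulation_steps = 8,
        max_steps = 120,
        learning_rate = 5e-5,            # LoRA adapters
        embedding_learning_rate = 5e-6,  # lm_head / embed_tokens, typically smaller
        output_dir = "outputs",
    ),
)
trainer.train()
```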

@ahmadmustafaanis As the error suggests, your dataset might be doing:
```
user: ...
assistant: ...
assistant: ...
user: ...
```
It must be alternating.
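
For example, a well-formed row would look something like this (using the common `role` / `content` message format):

```python
# Roles strictly alternate, starting with the user and ending with the assistant.
conversation = [
    {"role": "user",      "content": "Hi, what is 2 + 2?"},
    {"role": "assistant", "content": "2 + 2 = 4."},
    {"role": "user",      "content": "And 3 + 3?"},
    {"role": "assistant", "content": "3 + 3 = 6."},
]
```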

Apologies on the delay - just relocated to SF hence the slowness! You could try editing the chat template itself and removing `raise_error...`
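
A rough sketch of that idea - it assumes the template contains a Jinja `raise_exception(...)` guard for non-alternating roles; the exact string differs per model, so inspect `tokenizer.chat_template` first:

```python
# Inspect the template to find the guard that raises on non-alternating roles.
print(tokenizer.chat_template)

# Illustrative only: copy the real guard string from the printout above,
# then strip it out of the template so it no longer raises.
guard = "{{ raise_exception('Conversation roles must alternate user/assistant/user/assistant/...') }}"
tokenizer.chat_template = tokenizer.chat_template.replace(guard, "")
```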

Hmm it depends on Bitsandbytes

@RonanKMcGovern Sadly the derivatives are a nightmare for layernorms. Also training them isn't generally advised since you might overfit the weight updates to the norms, so I normally don't advise...

@arnavgarg1 Actually can reproduce - I also just downloaded the original codellama-7b tokenizer without Unsloth, but just using my `assert_same_tokenization`, and it still breaks - it seems like an...

@arnavgarg1 @fazeelzafar I think I fixed it! I essentially just skip my checks for CodeLlama type models - dumb quick fix, but it seems like it's just a CodeLlama thing...