Daniel Han
@patrickjchen So you're using the Kaggle notebook here https://www.kaggle.com/code/danielhanchen/kaggle-gemma-7b-unsloth-notebook/ right? I'm unsure about the internet connection settings sadly - not a Kaggle expert :(
@patrickjchen Ok, I'll take a look
@alarecha24 That's a weird error msg - is this for Gemma? What's your GPU?
@nick-gt Are you using `load_in_4bit = True`?
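For reference, here's a minimal sketch of loading a model in 4-bit with Unsloth - the model name and `max_seq_length` are example values, not from this thread:
```python
from unsloth import FastLanguageModel

# Minimal sketch - model name and max_seq_length are example values
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",  # example 4-bit model
    max_seq_length = 2048,
    load_in_4bit = True,  # quantized 4-bit loading via bitsandbytes
)
```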
@jgarcia2809 I just reuploaded Codellama-13b - hopefully it works now
Ok that's very weird - I'll see what I can do. For now it's best to use `unsloth/llama-3.1-8b`; another approach is to uninstall Unsloth then reinstall it
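If anyone wants to try the reinstall route, something like this should do it (a sketch - the exact install extras depend on your CUDA/torch setup):
```
pip uninstall unsloth -y
pip install unsloth
```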
@kiddyboots216 Are you using `bf16 = True` or `fp16 = True` in the Trainer?
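For context, a minimal sketch of how those flags are set on Hugging Face `TrainingArguments` - the other values here are just examples:
```python
import torch
from transformers import TrainingArguments

# Minimal sketch - use bf16 on GPUs that support it, fp16 otherwise
args = TrainingArguments(
    output_dir = "outputs",
    per_device_train_batch_size = 2,
    max_steps = 60,
    bf16 = torch.cuda.is_bf16_supported(),
    fp16 = not torch.cuda.is_bf16_supported(),
)
```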
@kiddyboots216 Oh wait, use `FastLanguageModel`. Also you can copy-paste our Colab notebook if that works https://colab.research.google.com/drive/1Dyauq4kTZoLewQ1cApceUQVNcnnNTzg_?usp=sharing
For a full example:
```python
from unsloth import FastLanguageModel
import torch
from trl import SFTTrainer
from transformers import TrainingArguments
from datasets import load_dataset

max_seq_length = 2048  # Supports RoPE Scaling...
```
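That snippet is truncated; here's a hedged sketch of how it typically continues - the model name, dataset, and hyperparameters below are assumptions for illustration, not from the original message:
```python
# Sketch of a typical continuation - names and values are assumptions
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "unsloth/llama-3-8b-bnb-4bit",  # example model
    max_seq_length = max_seq_length,
    dtype = None,         # auto-detect: bf16 on newer GPUs, fp16 otherwise
    load_in_4bit = True,  # 4-bit QLoRA-style loading
)

model = FastLanguageModel.get_peft_model(
    model,
    r = 16,  # LoRA rank - example value
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    lora_alpha = 16,
)

# Example dataset - replace with your own; assumes a "text" column
dataset = load_dataset("json", data_files = {"train": "data.jsonl"}, split = "train")

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "text",
    max_seq_length = max_seq_length,
    args = TrainingArguments(
        per_device_train_batch_size = 2,
        max_steps = 60,
        output_dir = "outputs",
    ),
)
trainer.train()
```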
Oh that's actually upcasting!! So A and B were incorrect in float16, causing incorrect training runs
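For anyone following along, here's a rough sketch of the kind of fix being described - upcasting the LoRA A and B matrices to float32 so they don't lose precision during training. This is illustrative only, not the actual patch:
```python
import torch

def upcast_lora_weights(model):
    # Illustrative sketch: LoRA A and B held in float16 can accumulate
    # precision errors, so cast them to float32 before training. The
    # rest of the model can stay in half precision.
    for name, param in model.named_parameters():
        if "lora_A" in name or "lora_B" in name:
            param.data = param.data.to(torch.float32)
    return model
```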