CyberNative AI

Results: 5 comments by CyberNative AI

Same issue with codestral, at exactly the middle of the epoch:

```
{'loss': 0.6265, 'grad_norm': 8.632710456848145, 'learning_rate': 0.00012, 'epoch': 0.48}
 25%|███████████ | 6/24 [06:05
```

Last update: idk what happened, but it is working now! Thanks @winglian, it might be the env you suggested plus disabling sample_packing/pad_to_sequence_len! @winglian thanks for your reply, it seems to crash...

I was able to train the model with the env @winglian suggested:

```
CUDA_LAUNCH_BLOCKING=1
```

And the following config:

```yaml
base_model: mistralai/Codestral-22B-v0.1
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
trust_remote_code: true
load_in_8bit: true
load_in_4bit: false
...
```
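For context, `CUDA_LAUNCH_BLOCKING=1` makes CUDA kernel launches synchronous, so a crash surfaces at the actual failing kernel instead of at some later, unrelated call. A minimal sketch of how it would be set for a training run (the config filename `codestral.yml` is a hypothetical placeholder, not from the original thread):

```shell
# Make CUDA kernel launches synchronous so errors are reported at the
# failing call rather than asynchronously later. Slows training, so use
# it only while debugging.
export CUDA_LAUNCH_BLOCKING=1

# Hypothetical launch command, assuming the config above is saved as
# codestral.yml (axolotl's standard accelerate-based entry point):
# accelerate launch -m axolotl.cli.train codestral.yml

# Confirm the variable is visible to child processes.
echo "$CUDA_LAUNCH_BLOCKING"
```

Note this costs throughput, since every launch blocks until the kernel finishes, so it is best removed once the crash is isolated.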

Can you try like this?

```yaml
datasets:
  - path: teknium/OpenHermes-2.5
    conversation: chatml
    type: sharegpt
```

yup, same for chatml: `chat_template: chatml`