Thomas Capelle
wandb is used in finetune.py, so why not add it here =)
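A minimal sketch of the kind of wandb logging being proposed (project name, config values, and loop are illustrative, not the actual finetune.py code):

```python
# Minimal sketch of W&B logging (illustrative; not the actual finetune.py code).
import wandb

wandb.init(project="finetune-demo", config={"lr": 1e-4, "epochs": 3})  # hypothetical config

for step in range(100):
    loss = 1.0 / (step + 1)  # placeholder for the real training loss
    wandb.log({"train/loss": loss}, step=step)

wandb.finish()
```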
I am running the Llama/Mistral inference examples on my M1 Pro with 16 GB of memory and getting around 80 sec/token. - Does the framework support FP16? - GPU usage seems low, do I...
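For reference, a generic sketch of fp16 inference on Apple silicon, assuming a Hugging Face-style API (the examples in question may use a different loading path; the checkpoint name is only an example):

```python
# Sketch of fp16 inference on Apple silicon, assuming a Hugging Face-style API.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "mistralai/Mistral-7B-v0.1"  # example checkpoint
device = "mps" if torch.backends.mps.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16).to(device)

inputs = tokenizer("Hello", return_tensors="pt").to(device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```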
- Adds notebooks
- Adds data
Add pip install for Colab
The idea is to show:
- A progress bar with the actual total count
- The same steps logged and reported on the progress bar
- Count a training step...
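A sketch of the intended behavior, with names and counts purely illustrative: one tqdm bar whose total matches the real number of training steps, updated once per logged step.

```python
# Sketch: progress bar with the actual total, updated in lockstep with logging.
from tqdm import tqdm

num_epochs, steps_per_epoch = 3, 50
total_steps = num_epochs * steps_per_epoch

pbar = tqdm(total=total_steps, desc="training")
global_step = 0
for epoch in range(num_epochs):
    for batch_idx in range(steps_per_epoch):
        loss = 1.0 / (global_step + 1)  # placeholder for the real training step
        global_step += 1
        pbar.update(1)
        pbar.set_postfix(loss=f"{loss:.4f}", step=global_step)
pbar.close()
```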
Added the grad-norm function that was originally added in the improved logging experience. - It may be moved somewhere else, but I think it's a really relevant metric to...
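A minimal sketch of a gradient-norm helper of the kind described (the signature is assumed; the actual function lives in the improved-logging changes):

```python
# Sketch of a gradient-norm helper (assumed signature, not the actual implementation).
import torch

def grad_norm(model: torch.nn.Module, norm_type: float = 2.0) -> float:
    """Return the total gradient norm across all parameters that have gradients."""
    norms = [p.grad.detach().norm(norm_type) for p in model.parameters() if p.grad is not None]
    if not norms:
        return 0.0
    return torch.norm(torch.stack(norms), norm_type).item()
```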
That's it: renaming the step counter to `global_step`. Why? - It makes more sense and reads better - It is named `global_step` in the HF Trainer, making using torchtune familiar...
Hello, I have been using the Zephyr DPO recipe and the models I get are saved in float32. I am using config_full and accelerate multi_gpu.yaml. I think the issue is...
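For illustration, one way to end up with half-precision checkpoints instead of float32 is to cast the weights down before saving, assuming a Hugging Face-style model object (checkpoint name and output path below are examples, not the recipe's actual code):

```python
# Sketch: cast a trained model to bf16 before saving so the checkpoint is not float32.
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained("HuggingFaceH4/mistral-7b-sft-beta")  # example checkpoint
model = model.to(dtype=torch.bfloat16)     # cast weights down before writing to disk
model.save_pretrained("./dpo-model-bf16")  # weights are serialized in bf16
```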
I was looking at the logs of your training (from this [json](https://huggingface.co/HuggingFaceH4/mistral-7b-sft-beta/resolve/main/trainer_state.json?download=true) file) and realized that the scheduling is messed up. It's related to the ConstantLength dataset not computing its...
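A sketch of the workaround this implies: when the packed dataset is iterable and has no reliable length, the Trainer cannot infer the true number of steps, so the LR schedule has to be pinned explicitly with `max_steps` (the values below are illustrative):

```python
# Sketch: pin the total step count so the LR scheduler decays over the real run length.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./sft-out",
    max_steps=2_000,               # explicit total; otherwise the schedule is wrong
    lr_scheduler_type="cosine",
    learning_rate=2e-5,
    warmup_ratio=0.1,
)
```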
- Move image to the same folder
- Add to sidebar