tiny-llama qlora example does not work ("please set from_tf=True")
Please check that this issue hasn't been reported before.
- [X] I searched previous bug reports and didn't find any similar reports.
Expected Behavior
training proceeds with the tiny-llama qlora example
Current behaviour
With only the batch size changed to 5 and the dataset path updated, I get the following error:

```
[ERROR] [axolotl.load_model:544] [PID:469] [RANK:0] Unable to load weights from pytorch checkpoint file for '/root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-intermediate-step-1431k-3T/snapshots/4b8dd7e43ec08c24ccaf89cbf67898cff53c95ae/pytorch_model.bin' at '/root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-intermediate-step-1431k-3T/snapshots/4b8dd7e43ec08c24ccaf89cbf67898cff53c95ae/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.
```
I have tried multiple winglian/axolotl Docker tags (dev-latest, main-latest, and main-py3.10-cu121-2.1.1) as well as running it locally; all hit this issue.
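For what it's worth, here is a hypothetical, stdlib-only sketch of how one could sanity-check what format the cached `pytorch_model.bin` actually is (a PyTorch zip checkpoint vs. safetensors), since the error hints at a TF checkpoint; the `checkpoint_kind` helper and its labels are made up for illustration:

```python
import json
import struct
import zipfile

def checkpoint_kind(path):
    """Best-effort guess at a model file's on-disk format (hypothetical helper)."""
    # torch.save has used a zip container by default since PyTorch 1.6
    if zipfile.is_zipfile(path):
        return "pytorch-zip"
    with open(path, "rb") as f:
        header = f.read(8)
        if len(header) == 8:
            # safetensors files start with an 8-byte little-endian header
            # length, followed by that many bytes of a JSON header
            (n,) = struct.unpack("<Q", header)
            try:
                json.loads(f.read(n))
                return "safetensors"
            except Exception:
                pass
    return "unknown"
```

Running this on the snapshot path from the error message would show whether the download is a normal PyTorch zip checkpoint or something else entirely (e.g. a truncated or mis-served file).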
Steps to reproduce
- (most likely optional) change the batch size to 5
- (most likely optional) set the dataset to the path of a completion-formatted JSON file, i.e. `[{"text": ...}, {"text": ...}]`
- start training
- get error
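For reference, the dataset shape from step 2 above can be produced like this (file name and contents are made up for illustration; the exact fields axolotl's completion loader accepts are an assumption here):

```python
import json

# Hypothetical completion-format dataset: a JSON list of {"text": ...} records.
records = [
    {"text": "The first training example."},
    {"text": "The second training example."},
]

with open("completion_dataset.json", "w") as f:
    json.dump(records, f)

# Re-read and sanity-check the shape before pointing the config's dataset path at it.
with open("completion_dataset.json") as f:
    loaded = json.load(f)

assert all(isinstance(r.get("text"), str) for r in loaded)
```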
Config yaml
https://github.com/OpenAccess-AI-Collective/axolotl/blob/44ba616da2e5007837361bd727d6ea1fe07b3a0e/examples/tiny-llama/qlora.yml
Possible solution
No response
Which Operating Systems are you using?
- [X] Linux
- [ ] macOS
- [X] Windows
Python Version
3.11.2
axolotl branch-commit
44ba616
Acknowledgements
- [X] My issue title is concise, descriptive, and in title casing.
- [X] I have searched the existing issues to make sure this bug has not been reported yet.
- [X] I am using the latest version of axolotl.
- [X] I have provided enough information for the maintainers to reproduce and diagnose the issue.
Try `unsloth/tinyllama` as the base model instead, which has safetensors weights. I believe the original version is using TensorFlow checkpoints that don't play nicely with HF.
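If that works, the change to the example config would be a one-line swap (sketch only; whether the linked qlora.yml needs any other fields adjusted is an assumption):

```yaml
# hypothetical edit to examples/tiny-llama/qlora.yml:
# point at the safetensors mirror instead of the original repo
base_model: unsloth/tinyllama
```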
I tested it, and a few notes:
- `unsloth/tinyllama` works; should the qlora.yml example point to `unsloth/tinyllama` instead?
- https://github.com/hiyouga/LLaMA-Factory works with the non-unsloth variant too. What are they doing differently than axolotl, and is it worth merging in? E.g., are they pre-converting to safetensors?
In LLaMA-Factory, are you loading the original variant without any unsloth optimizations?
Yes, just using this one as the model input: https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T