
tiny-llama qlora example does not work ("please set from_tf=True")

Open lucyknada opened this issue 1 year ago • 4 comments

Please check that this issue hasn't been reported before.

  • [X] I searched previous Bug Reports and didn't find any similar reports.

Expected Behavior

training proceeds with the tiny-llama qlora example

Current behaviour

After changing only the batch size (to 5) and the dataset path, I get the following error:

[ERROR] [axolotl.load_model:544] [PID:469] [RANK:0] Unable to load weights from pytorch checkpoint file for '/root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-intermediate-step-1431k-3T/snapshots/4b8dd7e43ec08c24ccaf89cbf67898cff53c95ae/pytorch_model.bin' at '/root/.cache/huggingface/hub/models--TinyLlama--TinyLlama-1.1B-intermediate-step-1431k-3T/snapshots/4b8dd7e43ec08c24ccaf89cbf67898cff53c95ae/pytorch_model.bin'. If you tried to load a PyTorch model from a TF 2.0 checkpoint, please set from_tf=True.

I have tried multiple winglian/axolotl Docker tags (dev-latest, main-latest, and main-py3.10-cu121-2.1.1) as well as running locally; all have the same issue.

Steps to reproduce

  1. (most likely optional) change the batch size to 5
  2. (most likely optional) set the dataset to a completion-formatted JSON file path: [{"text": ...}, {"text": ...}] (see the config sketch after this list)
  3. start training
  4. get the error
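For clarity, the deltas against the stock example would look roughly like this; a minimal sketch, assuming the batch-size change maps to `micro_batch_size` and the dataset uses axolotl's `completion` type (the path is a placeholder):

```yaml
# Sketch: only the overrides against examples/tiny-llama/qlora.yml
micro_batch_size: 5
datasets:
  - path: data/completions.json   # placeholder; [{"text": ...}, {"text": ...}]
    type: completion
```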

Config yaml

https://github.com/OpenAccess-AI-Collective/axolotl/blob/44ba616da2e5007837361bd727d6ea1fe07b3a0e/examples/tiny-llama/qlora.yml

Possible solution

No response

Which Operating Systems are you using?

  • [X] Linux
  • [ ] macOS
  • [X] Windows

Python Version

3.11.2

axolotl branch-commit

44ba616

Acknowledgements

  • [X] My issue title is concise, descriptive, and in title casing.
  • [X] I have searched the existing issues to make sure this bug has not been reported yet.
  • [X] I am using the latest version of axolotl.
  • [X] I have provided enough information for the maintainers to reproduce and diagnose the issue.

lucyknada · Jan 11 '24 23:01

Try unsloth/tinyllama as the base model instead, which has safetensors. I believe the original version uses TensorFlow checkpoints that don't play nicely with HF.
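For anyone landing here, the suggested swap is a one-line change against the same example config (sketch):

```yaml
base_model: unsloth/tinyllama
```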

winglian · Jan 13 '24 05:01

I tested it and a few notes:

  • unsloth/tinyllama works
  • should the qlora.yml example point to unsloth/tinyllama instead?
  • https://github.com/hiyouga/LLaMA-Factory works with the non-unsloth variant too; what are they doing differently than axolotl, and is it worth merging in? E.g., are they pre-converting to safetensors? (a conversion sketch follows this list)
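On the pre-conversion question: done by hand, it would look roughly like this. A sketch assuming the checkpoint is a plain state_dict and the file names are the HF defaults; this is not confirmed to be what LLaMA-Factory actually does.

```python
# Sketch: convert a PyTorch .bin checkpoint to safetensors by hand.
# Assumes a plain state_dict with no shared/aliased tensors
# (safetensors.torch.save_file rejects tensors that share storage).
import torch
from safetensors.torch import save_file

state_dict = torch.load("pytorch_model.bin", map_location="cpu")
# save_file requires contiguous tensors
state_dict = {k: v.contiguous() for k, v in state_dict.items()}
save_file(state_dict, "model.safetensors")
```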

lucyknada · Jan 13 '24 17:01

In LLaMA-Factory, are you loading the original variant without any unsloth optimizations?

winglian · Jan 13 '24 19:01

Yes, just using this one as the model input: https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T

lucyknada · Jan 13 '24 19:01