Wing Lian
Are you thinking of something like `model_name@revision+lora_path`?
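For illustration, here's a minimal sketch of how such a combined spec could be split apart; the `parse_model_spec` helper and the `@`/`+` format are assumptions for this example, not an existing axolotl feature.

```python
# Hypothetical parser for a combined "model_name@revision+lora_path" spec.
def parse_model_spec(spec: str) -> dict:
    """Split 'model_name@revision+lora_path' into its parts."""
    base, _, lora_path = spec.partition("+")
    model_name, _, revision = base.partition("@")
    return {
        "model_name": model_name,
        "revision": revision or None,
        "lora_path": lora_path or None,
    }

print(parse_model_spec("mistralai/Mistral-7B-v0.1@main+./lora-out"))
# {'model_name': 'mistralai/Mistral-7B-v0.1', 'revision': 'main', 'lora_path': './lora-out'}
```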
Can you verify that flash attention is installed?
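One quick way to check is to import the package directly; `flash_attn` is the module installed by the `flash-attn` pip package:

```python
# Prints the installed flash-attn version, or a message if it's missing.
try:
    import flash_attn
    print("flash-attn version:", flash_attn.__version__)
except ImportError:
    print("flash-attn is not installed in this environment")
```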
This should be fixed in the latest main by #1345.
@Jay-Nehra Which dependency conflicts? Are you using deepspeed? If so, what version? See https://github.com/OpenAccess-AI-Collective/axolotl/issues/1320#issuecomment-1962329372
@Jay-Nehra There is already an upstream PR to fix this: https://github.com/huggingface/transformers/pull/29212
If you're using fp16, you'll likely have to turn your learning rate way down. You're getting overflows/underflows of the fp16 values, which leads to a loss of 0.
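A small demonstration of the underlying issue, using plain torch casts: fp16 can only represent values down to roughly 6e-8 and up to 65504, so anything outside that range silently becomes 0 or inf.

```python
import torch

# fp16's smallest positive (subnormal) value is ~6e-8; anything smaller
# underflows to exactly 0.
print(torch.tensor(1e-8).to(torch.float16))   # tensor(0., dtype=torch.float16)

# The other direction: fp16 tops out at 65504, so larger values overflow to inf.
print(torch.tensor(1e6).to(torch.float16))    # tensor(inf, dtype=torch.float16)
```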
@JohanWork I'm working on a slight refactor of the validation to use Pydantic, so let's fix and merge this after that lands. Thanks!
@JohanWork The Pydantic refactor has been merged. Let me know if you have any questions about it.
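For anyone following along, here is a minimal sketch of the idea behind Pydantic-based config validation; the `TrainingConfig` model and its fields are illustrative, not the actual schema that was merged.

```python
from typing import Optional
from pydantic import BaseModel, field_validator

class TrainingConfig(BaseModel):
    # Illustrative fields only; the real config schema has many more options.
    base_model: str
    learning_rate: float = 2e-5
    micro_batch_size: int = 1
    lora_r: Optional[int] = None

    @field_validator("learning_rate")
    @classmethod
    def lr_must_be_positive(cls, v: float) -> float:
        if v <= 0:
            raise ValueError("learning_rate must be positive")
        return v

# Raises a ValidationError early, instead of failing mid-training.
cfg = TrainingConfig(base_model="mistralai/Mistral-7B-v0.1", learning_rate=2e-5)
```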
I took a first pass at fixing the pylint errors, but mypy has still found a few issues:

```
src/axolotl/custom_optim/sophia.py:233: error: "float" has no attribute "neg"  [attr-defined]
src/axolotl/custom_optim/lion.py:161: error: "None"...
```
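Purely as a hedged illustration of the usual shape of these fixes (the real sophia.py / lion.py code may need something different): errors like these typically mean a Tensor method is being called on a plain float, or an Optional value is used before being narrowed.

```python
import torch
from typing import Optional

# Pattern 1: '"float" has no attribute "neg"' -- use plain arithmetic on
# Python floats instead of Tensor methods (or make the value a Tensor).
lr: float = 1e-4
neg_lr = -lr                      # instead of lr.neg()

# Pattern 2: '"None" has no attribute ...' -- narrow the Optional before use
# so mypy knows the value is present at that point.
exp_avg: Optional[torch.Tensor] = torch.zeros(4)
if exp_avg is None:
    raise ValueError("optimizer state not initialized")
exp_avg.add_(1.0)
```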
I would recommend upgrading to torch 2.1.2.