axolotl
Adding Phi-3 model
fsdp_transformer_layer_cls_to_wrap: Phi3DecoderLayer
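For context, this wrapping hint sits inside axolotl's FSDP settings. A minimal sketch of how it might appear in a config, assuming the common `fsdp` / `fsdp_config` layout used in axolotl example configs (the surrounding keys are assumptions, only the `Phi3DecoderLayer` line is from this PR):

```yaml
fsdp:
  - full_shard
  - auto_wrap
fsdp_config:
  fsdp_transformer_layer_cls_to_wrap: Phi3DecoderLayer
```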
Will you also add support for the Phi-3 chat template for fine-tuning? As a reference: https://github.com/unslothai/unsloth/blob/4211cc01409e3ced4f7abebaf68e244193b46e2c/unsloth/chat_templates.py#L269C3-L269C8
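For reference, the Phi-3 instruct format uses `<|user|>`, `<|assistant|>`, and `<|end|>` tags per the model card. A minimal sketch of applying that template by hand (the helper name is hypothetical, not an axolotl API):

```python
# Hypothetical helper illustrating the Phi-3 instruct chat format:
# each turn is "<|role|>\n{content}<|end|>\n", with a trailing
# "<|assistant|>\n" to prompt the model's reply.
def format_phi3_chat(messages):
    parts = []
    for msg in messages:
        parts.append(f"<|{msg['role']}|>\n{msg['content']}<|end|>\n")
    parts.append("<|assistant|>\n")
    return "".join(parts)


print(format_phi3_chat([{"role": "user", "content": "Hello"}]))
```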
will merge this after #1582
Was wondering when this would be merged?
Too busy to work on this :P
@winglian I am having some issues with phi-3-small and phi-3-medium models.
- phi-3-medium (4k instruct) constantly fails with:
  RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
- phi-3-small (8k instruct) fails with:
  ValueError: For now, we do not support unknown special tokens
  In the future, if there is a need for this, we can add special tokens to the tokenizer
  starting from rank 100261 - 100263 and then 100266 - 100275.
  And finally, we can re-construct the enc object back
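(Not something this PR addresses, but for the first error the usual workaround is to force the `spawn` start method so worker processes start fresh instead of forking a CUDA-initialized parent. A minimal sketch with a toy task standing in for the CUDA-touching work:)

```python
import multiprocessing as mp


def worker(x):
    # Stand-in for a task that would initialize CUDA; under the default
    # 'fork' start method that raises "Cannot re-initialize CUDA in
    # forked subprocess" once the parent has touched CUDA.
    return x * 2


if __name__ == "__main__":
    # 'spawn' launches fresh interpreter processes rather than forking,
    # which is the start method the RuntimeError asks for.
    ctx = mp.get_context("spawn")
    with ctx.Pool(2) as pool:
        print(pool.map(worker, [1, 2, 3]))  # [2, 4, 6]
```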
Is it correct to assume this PR won't solve these issues?
@maziyarpanahi yeah, I don't think this PR will resolve that issue.
thanks @monk1337 !
Thank you @winglian and @monk1337