Jioh L. Jung
https://github.com/ml-explore/mlx-examples/pull/645 I've opened the full fine-tune code. :) I tested it on my M2 Mac Studio (192 GB) with Phi-2 (2.8B), and it really works well.
Fixed the load/save functions; fully tested with the Phi-2 2.8B model. Model file saving works well, and resuming model training also works.

> Do you perform your full fine-tune...
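For context, a minimal sketch of the save/resume pattern using MLX's `Module.save_weights` / `load_weights`. The checkpoint file name and helper names here are hypothetical and not necessarily what the PR does:

```python
# Sketch of checkpoint save/resume with MLX's built-in weight I/O.
# Assumptions: `model` is an mlx.nn.Module (e.g. Phi-2 loaded via mlx_lm);
# the file name "full_finetune.safetensors" is hypothetical.
import os

import mlx.nn as nn


def save_checkpoint(model: nn.Module, path: str = "full_finetune.safetensors") -> None:
    # Module.save_weights infers the format from the extension
    # (.safetensors or .npz) and writes the current parameters.
    model.save_weights(path)


def maybe_resume(model: nn.Module, path: str = "full_finetune.safetensors") -> nn.Module:
    # If a previous checkpoint exists, load it back into the module before
    # training continues; strict=True errors out on key/shape mismatches.
    if os.path.exists(path):
        model.load_weights(path, strict=True)
        print(f"Resumed weights from {path}")
    return model
```

Note that an exact resume would also need the optimizer state; reloading weights alone restarts the optimizer's moments from scratch.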
Test example:
- M2 Studio / 192 GB
- Model: Phi-2 2.8B
- Training set: chat completion / 400 items (only iters set to 10, to show a running demo)

```
$...
```
> Tried training qwen-1.8b. NaN loss immediately. Will try phi-2.

When I tried Gemma-2b, I got the same NaN loss. Maybe it's an issue in the foundation code, perhaps in models/*? I didn't check.
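Not a fix, just a minimal sketch (with placeholder names like `loss_fn` and `batch`, not the repo's actual code) of the kind of guard one could drop into the training step to see where the NaN first shows up:

```python
# Sketch of a NaN guard inside a training step: check the loss right after
# the forward/backward pass and before the optimizer update.
import mlx.core as mx
import mlx.nn as nn
import mlx.optimizers as optim


def train_step(model: nn.Module, optimizer: optim.Optimizer, loss_fn, batch):
    # value_and_grad returns the loss and the gradients w.r.t. the
    # model's trainable parameters.
    loss_and_grad = nn.value_and_grad(model, loss_fn)
    loss, grads = loss_and_grad(model, batch)

    # Guard: stop as soon as the loss turns into NaN so the offending batch
    # or layer (e.g. a dtype or mask issue in models/*) can be inspected.
    if mx.any(mx.isnan(loss)).item():
        raise RuntimeError("NaN loss detected; inspect this batch and the model config")

    optimizer.update(model, grads)
    mx.eval(model.parameters(), optimizer.state)
    return loss
```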