luminal Phi model does not produce output on M3

Open jorgeantonio21 opened this issue 1 year ago • 4 comments

Currently, I can't extract an output by running the phi3 example:

 % cargo run --release --features metal

    Finished release [optimized] target(s) in 0.27s
     Running `/Users/jorgeantonio/dev/luminal/target/release/phi`
Defining graph           - 75ms
Compiling graph          - 4799ms
Loading model            - 3544ms
Processing Prompt        - 183ms (71.04 tok/s, 13 prompt tokens)
<|user|>
Please write me a python implementation of merge sort<|end|>
<|assistant|>


Average token generated in 46.66ms       - (21.43 tok/s)

May 04 '24 08:05 jorgeantonio21

This issue is related to #51

May 04 '24 08:05 jorgeantonio21

Does this still happen if you pull the main branch? I believe this has been fixed for others. It may be the same M3 issue that llama is facing.

May 05 '24 13:05 jafioti

I'm fairly certain the problem is the softmax kernel producing inf on your machine, which makes the logits come out as NaN and causes the blank token to be emitted, which is why you see no output at all. I will be revisiting the softmax kernel today or tomorrow to fix this.

May 05 '24 14:05 jafioti
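
For readers unfamiliar with this failure mode, here is a minimal CPU-side sketch in plain Rust of one common way a softmax ends up with inf and then NaN. It is not luminal's Metal kernel, and whether this exact mechanism is what happens on M3 is an assumption. The function names softmax_naive and softmax_stable are made up for illustration. The naive version exponentiates the raw values, so a large input overflows f32 to inf and the division inf / inf yields NaN. The stable version subtracts the maximum first, so every exponent is at most zero.

```rust
/// Naive softmax (illustrative only): exponentiates the raw inputs directly.
/// For f32, exp(x) overflows to +inf once x is a little above 88, and the
/// following inf / inf division produces NaN.
fn softmax_naive(logits: &[f32]) -> Vec<f32> {
    let exps: Vec<f32> = logits.iter().map(|&x| x.exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}

/// Numerically stable softmax (illustrative only): subtracts the maximum
/// input before exponentiating, so the largest exponent is exactly 0 and
/// exp() cannot overflow.
fn softmax_stable(logits: &[f32]) -> Vec<f32> {
    let max = logits.iter().copied().fold(f32::NEG_INFINITY, f32::max);
    let exps: Vec<f32> = logits.iter().map(|&x| (x - max).exp()).collect();
    let sum: f32 = exps.iter().sum();
    exps.iter().map(|&e| e / sum).collect()
}
```

Calling softmax_naive on an input such as [100.0, 0.0, -100.0] gives NaN in the first position, while softmax_stable returns finite probabilities that sum to 1 for the same input.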

I pulled the main branch right now, and the problem persists.

Thank you so much @jafioti !

May 06 '24 10:05 jorgeantonio21

Yes. Comment out SoftmaxCompiler in the luminal_metal lib.rs, and the Phi (and Llama) examples will work on M3.

Jun 04 '24 23:06 mikeseven

@mikeseven Does it give proper outputs? In the other issue you mentioned it gives bad outputs

Jun 05 '24 15:06 jafioti

Sorry for the confusion. I meant that the output looks correct, but not as good as with llama. It looks to me like a model accuracy issue.

Jun 06 '24 17:06 mikeseven

Ok I'll close this for now then, thanks

Jun 07 '24 03:06 jafioti
