candle icon indicating copy to clipboard operation
candle copied to clipboard

Codegemma-7b-instruct failure on Metal

Open niklasha opened this issue 1 year ago • 4 comments
trafficstars

cargo run --features metal --example gemma -- --which code-7b-it --prompt "explain isakmpd's architecture" fails with:

retrieved the files in 27.197292ms
loaded the model in 36.859128625s
explain isakmpd's architectureError: Metal error Invalid matmul arguments [1296, 81, 9, 1] [36864, 256, 4096, 1] (9, 256, 9)

The prompt is not of great importance, other prompts just give different strides, but fails equally. I did look into this a bit, but I confess it sort of goes over my current competence. I thought the stride vector always should be decreasing, but the rhs stride info is, as can be seen [36864, 256, 4096, 1], which does not fit into my mental model. However the running with "--cpu" does accept this. I am still sceptic it does the math correctly, since it too seems to get the same striding, but it may be I that misunderstand the concept.

niklasha avatar Apr 19 '24 17:04 niklasha

Thanks for reporting this, I think it's an issue that only happens on the 7b because of MQA (which is not present on the 2b version which was used for testing), could you give a try to #2091 , hopefully this should provide the appropriate fix.

LaurentMazare avatar Apr 19 '24 17:04 LaurentMazare

I have tested, and it does not crash anymore, thanks, and the output matches "--cpu". However the quality of the response to the example prompt is pretty low, subjectively. But that is not the key issue here I guess :-)

niklasha avatar Apr 19 '24 19:04 niklasha

Glad that it helped. Did you make sure to respect the prompt format? This example is very barebone and doesn't do it for you. https://huggingface.co/blog/codegemma#prompt-format

LaurentMazare avatar Apr 19 '24 19:04 LaurentMazare

Aha! thanks, well I just was testing and did not do my homework. No I did not respect the prompt format :-)

niklasha avatar Apr 19 '24 19:04 niklasha