I tried madlad400, but there is a problem with the output if it is float16
Hi.
I tried madlad400, but there is a problem with the output when the model is converted to float16:
$ python convert.py --model google/madlad400-3b-mt
$ python t5.py --model google/madlad400-3b-mt --prompt "<2ja>A tasty apple"
[INFO] Generating with T5...
Input: <2ja>A tasty apple
リンゴの味
Time: 20.28 seconds, tokens/s: 0.30
$ python convert.py --model google/madlad400-3b-mt --dtype float16
$ python t5.py --model google/madlad400-3b-mt --dtype float16 --prompt "<2ja>A tasty apple"
[INFO] Generating with T5...
Input: <2ja>A tasty apple
<unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk><unk>
Time: 25.66 seconds, tokens/s: 3.90
Thanks.
Indeed, T5 models typically don't work well in fp16; they probably need some kind of activation clipping or rescaling to fix this. mx.bfloat16 should work, though.
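The fp16 failure above is consistent with activation overflow: float16 saturates at 65504, so large intermediate activations become inf/NaN and the decoder ends up emitting `<unk>`. A minimal NumPy sketch (illustrative only, not the actual model code) of the overflow and of the activation-clipping workaround mentioned above:

```python
import numpy as np

# float16 tops out at 65504; larger magnitudes overflow to inf.
x = np.array([70000.0, -70000.0, 123.0], dtype=np.float32)

overflowed = x.astype(np.float16)
# overflowed is [inf, -inf, 123.0] -- inf then propagates to NaN logits

# Workaround: clamp activations into the representable fp16 range
# before the downcast (the "activation clipping" idea above).
f16_max = np.finfo(np.float16).max  # 65504.0
clipped = np.clip(x, -f16_max, f16_max).astype(np.float16)
# clipped is [65504.0, -65504.0, 123.0] -- finite and usable
```

bfloat16 avoids this because it keeps float32's exponent range (at the cost of mantissa precision), which is why the bf16 run below works.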
Thanks. It worked fine with bfloat16.
$ python convert.py --model google/madlad400-3b-mt --dtype bfloat16
$ python t5.py --model google/madlad400-3b-mt --dtype bfloat16 --prompt "<2ja>A tasty apple"
[INFO] Generating with T5...
Input: <2ja>A tasty apple
リンゴの味
Time: 18.48 seconds, tokens/s: 0.32
My machine has little memory, so it's swapping, which makes it slow. Haha.
The file size is still large, though, so it would be nice if the model could be used in int8 as well.
$ ls -lah google-madlad400-3b-mt.npz
6.5G google-madlad400-3b-mt.npz
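For scale: 3B parameters at 2 bytes each (fp16/bf16) is about 6 GB, which matches the 6.5G file, and int8 would roughly halve that. A hedged sketch of simple symmetric per-row int8 weight quantization in NumPy (illustrative only; MLX provides its own grouped quantization, and this is not the mlx-examples code):

```python
import numpy as np

def quantize_int8(w: np.ndarray):
    """Symmetric per-row int8 quantization: w ~= q * scale."""
    # One scale per row, chosen so the largest magnitude maps to 127.
    scale = np.abs(w).max(axis=1, keepdims=True) / 127.0
    q = np.round(w / scale).astype(np.int8)
    return q, scale.astype(np.float32)

def dequantize_int8(q: np.ndarray, scale: np.ndarray) -> np.ndarray:
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal((4, 8)).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = dequantize_int8(q, scale)

# int8 storage is ~2x smaller than fp16 (plus a tiny scale per row),
# at the cost of a bounded rounding error of at most 0.5 * scale.
max_err = np.abs(w - w_hat).max()
```

Rounding error is at most half a quantization step per weight, which is usually tolerable for weights of this size.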
Thanks.
Do you mind uploading your madlad-400 mlx to HF?
Please don't worry about it.
No, I mean I want to use it, and it would be easier if it were on HF.