useful-transformers
Providing help and FLOSS stack
Hello,
Your project looks cool. I was rather sad to see that Rockchip's NN framework failed to load any useful model.
I've done some reverse engineering (plus reading the datasheet) of the RK3588's NPU (https://github.com/phhusson/rknpu-reverse-engineering/), and I think I may be able to help.
Reading your TODO, it seems you're using RKNN exclusively for matrix (not higher-order tensor?) multiplications. Is that intended? (The NPU can also do ReLU, max/min/average pooling, and convolutions.)
I see you're waiting on Rockchip for int4 matmul. Assuming no hardware bug prevents it, I should be able to provide one, if that's the most useful thing you need. A rough sketch of what that involves at the data level follows.
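For context, the storage side of int4 is just two signed 4-bit values packed per byte, sign-extended on unpack. Here's a minimal C++ sketch; the names are mine and purely illustrative, not from either repo:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Pack signed 4-bit weights (each in [-8, 7]) two per byte: even index in
// the low nibble, odd index in the high nibble. Hypothetical helper, just
// to show the layout an int4 matmul kernel typically consumes.
std::vector<uint8_t> pack_int4(const std::vector<int8_t>& w) {
    std::vector<uint8_t> packed((w.size() + 1) / 2, 0);
    for (std::size_t i = 0; i < w.size(); ++i) {
        uint8_t nibble = static_cast<uint8_t>(w[i]) & 0x0F;  // keep low 4 bits
        packed[i / 2] |= (i % 2 == 0) ? nibble : static_cast<uint8_t>(nibble << 4);
    }
    return packed;
}

// Recover one signed 4-bit value from a packed byte.
int8_t unpack_int4(uint8_t b, bool high) {
    int v = high ? (b >> 4) : (b & 0x0F);              // 0..15
    return static_cast<int8_t>(v >= 8 ? v - 16 : v);   // sign-extend to [-8, 7]
}
```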
Either way, given your usage, I'll try to write a FLOSS reimplementation of Rockchip's matmul to get rid of that proprietary blob.
@phhusson This library is indeed using the proprietary binary blob to perform the matrix multiplications. It is unfortunate that Rockchip keeps the NPU fully closed. For the transformer models, 8-bit and/or 4-bit matrix multiplication is really all we need. Currently only FP16 matrix multiplies are used; I didn't see much performance improvement for the tiny.en model when using int8. Reverse engineering just the matrix multiplies would be quite useful for the community in general.
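For anyone following along, this is roughly the semantics an int8 matmul provides: int8 inputs quantized as q = round(x / scale), int32 accumulation, and dequantization by the product of the two scales. A plain-C++ reference sketch of what the hardware computes, not this library's actual code path:

```cpp
#include <cstdint>

// Reference int8 matmul: C = (A_q . B_q) * scale_a * scale_b, where
// A is M x K and B is K x N, both row-major, quantized to int8.
// Illustrative only; a real kernel would tile and vectorize this.
void matmul_int8_ref(const int8_t* A, const int8_t* B, float* C,
                     int M, int K, int N, float scale_a, float scale_b) {
    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
            int32_t acc = 0;  // accumulate in int32 to avoid overflow
            for (int k = 0; k < K; ++k)
                acc += int32_t(A[m * K + k]) * int32_t(B[k * N + n]);
            C[m * N + n] = acc * scale_a * scale_b;  // dequantize
        }
    }
}
```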