4Bit Tensor Support

Open jimtendo opened this issue 2 years ago • 0 comments

Looking at the tensor types, there doesn't currently appear to be any 4-bit support:

https://github.com/rockchip-linux/rknpu2/blob/master/runtime/RK3588/Linux/librknn_api/include/rknn_api.h#L127

Given that 4-bit quantization can work quite well for LLMs (see llama.cpp), is support for this possible on an RK3588? It might make these boards appealing platforms for running a dedicated LLM assistant.
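For anyone unfamiliar with what 4-bit storage involves, here is a minimal sketch of symmetric 4-bit quantization with two values packed per byte. It is only an illustration under stated assumptions: the function names are hypothetical, it uses a single scale for the whole list (llama.cpp's formats use per-block scales, which this does not reproduce), and nothing here is part of `rknn_api.h` or any Rockchip API.

```python
# Illustrative only: hypothetical helper names, one scale per list,
# not the llama.cpp layout and not an RKNN API.

def quantize_int4(values):
    """Symmetric 4-bit quantization: map floats to integers in [-8, 7]."""
    max_abs = max((abs(v) for v in values), default=0.0)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return q, scale


def pack_int4(q):
    """Pack signed 4-bit integers two per byte, low nibble first."""
    if len(q) % 2:
        q = q + [0]  # pad to an even count
    out = bytearray()
    for lo, hi in zip(q[0::2], q[1::2]):
        out.append((lo & 0x0F) | ((hi & 0x0F) << 4))
    return bytes(out)


def unpack_int4(data, count):
    """Inverse of pack_int4: recover `count` signed 4-bit integers."""
    def sign_extend(nibble):
        return nibble - 16 if nibble >= 8 else nibble

    q = []
    for byte in data:
        q.append(sign_extend(byte & 0x0F))
        q.append(sign_extend(byte >> 4))
    return q[:count]


def dequantize_int4(q, scale):
    """Map 4-bit integers back to approximate float values."""
    return [x * scale for x in q]


if __name__ == "__main__":
    weights = [0.7, -0.3, 0.1, 0.0, -0.7]
    q, scale = quantize_int4(weights)
    packed = pack_int4(q)
    restored = dequantize_int4(unpack_int4(packed, len(q)), scale)
    print(len(packed), "bytes for", len(weights), "weights")
    print(restored)
```

The point of the sketch is the storage cost: each weight takes half a byte plus a shared scale, which is why a native 4-bit tensor type would matter for fitting LLM weights into the memory available on these boards.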

jimtendo, May 11 '23 00:05