Add test for different quantization

Open Xuanwo opened this issue 1 year ago • 5 comments

Right now we have no way of knowing whether a given quantization works. We should have a simple test for each of them.

cc @flaneur2020, any ideas? Maybe we can start with a simple cargo run?

Xuanwo · Mar 03 '24 13:03

IMHO quantization can be tested by making a vec, quantizing it, dequantizing it back, and then checking the difference from the original.
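A minimal sketch of that round-trip idea, for illustration only. `BlockQ8`, `quantize_q8`, `dequantize_q8`, and `max_abs_error` are hypothetical names, and the format (32 values sharing one `f32` scale) is a simplified stand-in rather than crabml's actual buffer types; the tolerance is an assumption too.

```rust
/// Hypothetical block layout: 32 values share one f32 scale.
/// This is a simplified stand-in, not crabml's real quantized buffer.
const BLOCK_SIZE: usize = 32;

struct BlockQ8 {
    scale: f32,
    qs: [i8; BLOCK_SIZE],
}

/// Quantize f32 values to 8-bit integers, one scale per block.
fn quantize_q8(data: &[f32]) -> Vec<BlockQ8> {
    assert!(data.len() % BLOCK_SIZE == 0, "length must be a multiple of the block size");
    data.chunks_exact(BLOCK_SIZE)
        .map(|chunk| {
            let max_abs = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            let scale = max_abs / 127.0;
            // An all-zero block would otherwise divide by zero.
            let inv = if scale == 0.0 { 0.0 } else { 1.0 / scale };
            let mut qs = [0i8; BLOCK_SIZE];
            for (q, v) in qs.iter_mut().zip(chunk) {
                *q = (v * inv).round() as i8;
            }
            BlockQ8 { scale, qs }
        })
        .collect()
}

/// Expand the blocks back into f32 values.
fn dequantize_q8(blocks: &[BlockQ8]) -> Vec<f32> {
    blocks
        .iter()
        .flat_map(|b| b.qs.iter().map(move |&q| q as f32 * b.scale))
        .collect()
}

/// Largest element-wise absolute difference between two slices.
fn max_abs_error(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}

#[test]
fn q8_round_trip_stays_close_to_original() {
    // Make a vec, quantize it, dequantize it back, compare.
    let data: Vec<f32> = (0..64).map(|i| (i as f32 - 32.0) * 0.1).collect();
    let restored = dequantize_q8(&quantize_q8(&data));
    assert_eq!(restored.len(), data.len());
    // Assumed tolerance: a bit above half a quantization step (3.2 / 127 / 2).
    assert!(max_abs_error(&data, &restored) < 0.02);
}
```

The tolerance has to be chosen per quantization type, since lower bit widths lose more precision.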

flaneur2020 · Mar 03 '24 14:03

> IMHO quantization can be tested by making a vec, quantizing it, dequantizing it back, and then checking the difference from the original.

https://github.com/crabml/crabml/blob/071dea4b50b5fffe41a8c29bf681ae2a94ee01dc/crabml-core/src/backends/cpu/buf/buf_q4_1.rs#L175-L192

Something like this?

dqhl76 · Mar 03 '24 14:03

Also, I think we could add all the quantized models of tinyllama-15m to our testdata/ folder. This will make our repo larger, but I think it's worth it. Alternatively, the CI script could download them on the fly.

Then we can use a table-driven test to cover them all.
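A rough sketch of what a table-driven test could look like. Everything here is hypothetical: `Case`, `roundtrip`, and `failing_cases` are illustrative names, the quantizer is a generic symmetric stand-in instead of loading the tinyllama-15m model files, and the tolerances are assumed values, not measurements.

```rust
/// One row of the test table. Names and tolerances are illustrative.
struct Case {
    name: &'static str,
    bits: u32,
    max_err: f32,
}

/// Stand-in quantizer: symmetric, one scale for the whole slice,
/// `bits`-wide signed integers. Returns the dequantized values.
fn roundtrip(data: &[f32], bits: u32) -> Vec<f32> {
    let levels = ((1i32 << (bits - 1)) - 1) as f32;
    let max_abs = data.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    if max_abs == 0.0 {
        return data.to_vec();
    }
    let scale = max_abs / levels;
    data.iter().map(|v| (v / scale).round() * scale).collect()
}

/// Run every row of the table and collect the rows whose round-trip
/// error exceeds the row's tolerance, together with the observed error.
fn failing_cases(cases: &[Case], data: &[f32]) -> Vec<(&'static str, f32)> {
    cases
        .iter()
        .filter_map(|case| {
            let restored = roundtrip(data, case.bits);
            let err = data
                .iter()
                .zip(&restored)
                .map(|(a, b)| (a - b).abs())
                .fold(0.0, f32::max);
            if err > case.max_err { Some((case.name, err)) } else { None }
        })
        .collect()
}

#[test]
fn all_quantizations_round_trip() {
    let data: Vec<f32> = (0..=20).map(|i| -1.0 + i as f32 * 0.1).collect();
    // Adding a quantization type means adding one row here.
    let cases = [
        Case { name: "8-bit", bits: 8, max_err: 0.005 },
        Case { name: "4-bit", bits: 4, max_err: 0.08 },
    ];
    let failures = failing_cases(&cases, &data);
    assert!(failures.is_empty(), "failing cases: {:?}", failures);
}
```

With real model files, each row would presumably hold a file name under testdata/ and the expected output instead of a bit width, but the loop over the table stays the same.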

flaneur2020 · Mar 04 '24 03:03

Seems like a good idea. Let me do this.

Xuanwo · Mar 04 '24 04:03

Besides copying all the quantized models into the repo, I guess there's another way: find the GGUF converter in llama.cpp and use it in the CI stage to convert one f32 model file in testdata/ into all the quantization types.

flaneur2020 · Mar 11 '24 14:03