Add tests for the different quantization types
Right now we don't know whether a given quantization actually works; we should have a simple test for each of them.
cc @flaneur2020, any ideas? Maybe we can start with a simple cargo run?
IMHO quantization can be tested by making a vec, quantizing it, dequantizing it back, and then checking the difference from the original.
https://github.com/crabml/crabml/blob/071dea4b50b5fffe41a8c29bf681ae2a94ee01dc/crabml-core/src/backends/cpu/buf/buf_q4_1.rs#L175-L192
Something like this?
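For discussion, here is a minimal standalone sketch of that round-trip check. It does not use crabml's buffer types and is not the code at the link above: `BlockQ8`, `quantize`, `dequantize` and `max_abs_error` are all hypothetical names, and the format is just an assumed 8-bit block layout (32 values sharing one f32 scale). The point is only the shape of the test: quantize, dequantize, and bound the error by half a quantization step.

```rust
/// Hypothetical block size: 32 values share one f32 scale (an assumption,
/// not crabml's actual layout).
const BLOCK_SIZE: usize = 32;

/// Hypothetical 8-bit block: one scale plus 32 signed codes.
#[derive(Debug, Clone)]
struct BlockQ8 {
    scale: f32,
    qs: [i8; BLOCK_SIZE],
}

/// Quantize a slice whose length is a multiple of BLOCK_SIZE.
fn quantize(data: &[f32]) -> Vec<BlockQ8> {
    assert!(data.len() % BLOCK_SIZE == 0, "length must be a multiple of the block size");
    data.chunks_exact(BLOCK_SIZE)
        .map(|chunk| {
            // The largest magnitude in the block maps to code 127.
            let max_abs = chunk.iter().fold(0.0f32, |m, v| m.max(v.abs()));
            let scale = max_abs / 127.0;
            // An all-zero block keeps scale 0 and all codes 0.
            let inv = if scale > 0.0 { 1.0 / scale } else { 0.0 };
            let mut qs = [0i8; BLOCK_SIZE];
            for (q, v) in qs.iter_mut().zip(chunk) {
                *q = (v * inv).round() as i8;
            }
            BlockQ8 { scale, qs }
        })
        .collect()
}

/// Expand the blocks back into f32 values.
fn dequantize(blocks: &[BlockQ8]) -> Vec<f32> {
    blocks
        .iter()
        .flat_map(|b| b.qs.iter().map(move |&q| q as f32 * b.scale))
        .collect()
}

/// Largest element-wise absolute difference between two slices.
fn max_abs_error(a: &[f32], b: &[f32]) -> f32 {
    a.iter().zip(b).map(|(x, y)| (x - y).abs()).fold(0.0, f32::max)
}

#[test]
fn q8_roundtrip_error_is_bounded() {
    // Deterministic input in [-1, 1], two blocks long.
    let data: Vec<f32> = (0..64).map(|i| (i as f32 * 0.37).sin()).collect();
    let restored = dequantize(&quantize(&data));
    // Rounding loses at most half a step; the step is at most 1/127 here.
    assert!(max_abs_error(&data, &restored) <= 0.5 / 127.0 + 1e-5);
}
```

The tolerance would have to be chosen per format: a 4-bit type has a much coarser step than an 8-bit one, so a single shared threshold probably won't work.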
Also, I suppose we could add all the quantized models of tinyllama-15m to our testdata/ folder (this will make the repo larger, but I think it's worth it), or have the CI script download them on the fly,
and use a table-driven test to cover them all.
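A rough sketch of what the table-driven shape could look like. To keep it self-contained it runs over two made-up in-memory quantizers rather than real model files; `Case`, `CASES`, `roundtrip` and `run_cases` are hypothetical names, and the "sym-8bit"/"sym-4bit" rows are placeholders, not crabml or GGUF types. For the model-file version, each row would presumably carry a file path and an expected output instead of `max_code`.

```rust
/// One row of the test table. Names and values are illustrative only.
struct Case {
    name: &'static str,
    /// Largest magnitude of the signed integer code (127 for 8 bits, 7 for 4 bits).
    max_code: f32,
}

/// The table: adding a format to the test means adding one row.
const CASES: &[Case] = &[
    Case { name: "sym-8bit", max_code: 127.0 },
    Case { name: "sym-4bit", max_code: 7.0 },
];

/// Quantize with a single shared scale, then dequantize straight away.
fn roundtrip(data: &[f32], max_code: f32) -> Vec<f32> {
    let max_abs = data.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    if max_abs == 0.0 {
        return data.to_vec();
    }
    let scale = max_abs / max_code;
    data.iter()
        .map(|v| (v / scale).round().clamp(-max_code, max_code) * scale)
        .collect()
}

/// Run every row and return the names of the failing ones, so that one
/// broken format does not hide failures in the others.
fn run_cases(data: &[f32]) -> Vec<&'static str> {
    let max_abs = data.iter().fold(0.0f32, |m, v| m.max(v.abs()));
    CASES
        .iter()
        .filter(|c| {
            // Per-row tolerance: half a quantization step plus float slack.
            let tol = 0.5 * max_abs / c.max_code + 1e-5;
            let out = roundtrip(data, c.max_code);
            data.iter().zip(&out).any(|(a, b)| (a - b).abs() > tol)
        })
        .map(|c| c.name)
        .collect()
}

#[test]
fn all_cases_roundtrip_within_tolerance() {
    let data: Vec<f32> = (0..64).map(|i| (i as f32 * 0.37).sin()).collect();
    let failed = run_cases(&data);
    assert!(failed.is_empty(), "failing cases: {:?}", failed);
}
```

Collecting the failing names instead of asserting inside the loop means a single CI run reports every broken format at once.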
Seems like a good idea. Let me do this.
Besides copying all the quantized models into the repo, I guess there's another way: find the GGUF converter in llama.cpp and, in the CI stage, convert one f32 model file from testdata/ into all the quantized types.