Stephan Walter
Can you add `--ignore-missing` to the `sha256sum` command line in the readme? Many people won't have all the files, and the output is less noisy that way. (This is for Linux, don't...
> keep those int8 multiplications. Is there any processor people might reasonably use where that makes a difference? I'll change it if yes, otherwise it's more concise as it is.
You might remove `test-quantize.c`; that was my rather lazy attempt at a unit test.
I originally wrote the code parsing `/proc/cpuinfo` without having access to a wide variety of machines. It's good that you make the effort to improve this. How do the various...
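For context, a minimal sketch of what counting CPUs from `/proc/cpuinfo` can look like. This is not the code from the repository: `count_cpuinfo_processors` and `detect_thread_count` are hypothetical names, and the sketch assumes one line starting with a lowercase `processor` key per logical CPU, which may not hold on every architecture.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

// Count the lines that start with "processor" in a /proc/cpuinfo-style stream.
// Assumption: one such line per logical CPU (not per physical core).
int count_cpuinfo_processors(FILE *f) {
    assert(f != NULL);
    char line[1024];
    int count = 0;
    int at_line_start = 1;  // guards against matching inside an over-long line
    while (fgets(line, sizeof line, f) != NULL) {
        if (at_line_start && strncmp(line, "processor", 9) == 0) {
            count++;
        }
        at_line_start = strchr(line, '\n') != NULL;
    }
    return count;
}

// Hypothetical wrapper: returns `fallback` when the file cannot be opened
// or contains no matching entries.
int detect_thread_count(const char *path, int fallback) {
    FILE *f = fopen(path, "r");
    if (f == NULL) {
        return fallback;
    }
    int n = count_cpuinfo_processors(f);
    fclose(f);
    return n > 0 ? n : fallback;
}
```

Calling `detect_thread_count("/proc/cpuinfo", 4)` would then give the logical CPU count on Linux and `4` elsewhere. Taking a `FILE *` in the inner function keeps the parsing testable without access to many machines.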
> Should we change all the logic to be in the header? This is better kept in common.cpp. Maybe initialize the field to 0 or -1. Then move your code...
I don't think the idea was to change `QK` for an existing type, rather to add e.g. `Q4_2` which will have its own `QK42 != 32`
> Wondering if `QK4_0`, `QK4_1` and `QK8_0` wouldn't be better Sure, let's do that. That way we can also support >10 variants without confusion ;-) (at least in the numbering,...
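To illustrate the naming discussed here, a rough sketch with one block-size constant per quantization type. The value 32 follows the comment above about `QK`; the struct layouts and the `_sketch` names are assumptions for illustration only, not the definitions in the repository.

```c
#include <stdint.h>

// One block-size constant per type, so a new type can pick its own size
// without touching the existing ones.
#define QK4_0 32
#define QK4_1 32
#define QK8_0 32

// Illustrative layouts (assumed, see note above).
typedef struct {
    float   d;              // scale
    uint8_t qs[QK4_0 / 2];  // 4-bit values, two per byte
} block_q4_0_sketch;

typedef struct {
    float   d;              // scale
    float   m;              // minimum
    uint8_t qs[QK4_1 / 2];  // 4-bit values, two per byte
} block_q4_1_sketch;

typedef struct {
    float  d;               // scale
    int8_t qs[QK8_0];       // 8-bit values
} block_q8_0_sketch;

// Compile-time checks that each struct has exactly the expected size.
_Static_assert(sizeof(block_q4_0_sketch) == sizeof(float) + QK4_0 / 2, "padding in q4_0");
_Static_assert(sizeof(block_q4_1_sketch) == 2 * sizeof(float) + QK4_1 / 2, "padding in q4_1");
_Static_assert(sizeof(block_q8_0_sketch) == sizeof(float) + QK8_0, "padding in q8_0");
```

With separate constants, each struct and its size check refer only to their own `QK*` value, so changing one type's block size cannot silently affect another.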
On my Core i3-8100 (AVX2): ``` $ ./vdot 100 = -74.2272, -73.9193 time = 128.407 +/- 4.38483 us. maxt = 150.281 us timeq = 106.679 +/- 4.16175 us. maxt =...
Yes, `examples` is maybe not a great name, but it already contains various bits and pieces like your program.
I would have hoped the new format would be defined like this: ``` #define QK4_2 16 typedef struct { ggml_fp16_t d; // delta uint8_t qs[QK4_2 / 2]; // nibbles /...
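A rough sketch of how a 16-value block with one delta and packed nibbles could be filled and read back, based only on the part of the definition visible above. Several things are assumptions: a `float` stands in for `ggml_fp16_t` to keep the example self-contained, the symmetric scaling with an offset of 8 is one possible scheme rather than the proposed one, and all `_sketch` names are hypothetical.

```c
#include <stdint.h>

#define QK4_2 16

typedef struct {
    float   d;              // delta; float stand-in for ggml_fp16_t (assumption)
    uint8_t qs[QK4_2 / 2];  // nibbles, two values per byte
} block_q4_2_sketch;

// Round to nearest, clamp to [-7, 7], then shift into the unsigned range 1..15.
static int to_nibble_sketch(float v) {
    int q = (int)(v < 0.0f ? v - 0.5f : v + 0.5f);
    if (q < -7) q = -7;
    if (q >  7) q =  7;
    return q + 8;
}

// Quantize QK4_2 floats into one block (assumed scheme: delta = absmax / 7).
void quantize_block_q4_2_sketch(const float *x, block_q4_2_sketch *out) {
    float amax = 0.0f;
    for (int i = 0; i < QK4_2; i++) {
        const float a = x[i] < 0.0f ? -x[i] : x[i];
        if (a > amax) amax = a;
    }
    const float d  = amax / 7.0f;
    const float id = d != 0.0f ? 1.0f / d : 0.0f;
    out->d = d;
    for (int i = 0; i < QK4_2; i += 2) {
        const int q0 = to_nibble_sketch(x[i]     * id);  // low nibble
        const int q1 = to_nibble_sketch(x[i + 1] * id);  // high nibble
        out->qs[i / 2] = (uint8_t)(q0 | (q1 << 4));
    }
}

// Inverse of the function above.
void dequantize_block_q4_2_sketch(const block_q4_2_sketch *in, float *y) {
    for (int i = 0; i < QK4_2; i += 2) {
        const uint8_t b = in->qs[i / 2];
        y[i]     = (float)((b & 0x0F) - 8) * in->d;
        y[i + 1] = (float)((b >> 4)   - 8) * in->d;
    }
}
```

A smaller block means each delta covers fewer values, at the cost of storing more deltas per tensor.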