anzz1
Some checksums (q4_0 and gptq-4b quantizations, new tokenizer format): [ggml-q4-checksums.zip](https://github.com/ggerganov/llama.cpp/files/11053116/ggml-q4-checksums.zip) edit: added more checksums
@Green-Sky Yeah, there is only one; I might be thinking ahead too much. :smile: Also added some more checksums for the gptq-4b models above: https://github.com/ggerganov/llama.cpp/issues/374#issuecomment-1480719278
Yes, it might be good to differentiate them, as some have short fur and some long, and some are friendlier than others. But llamas will always be llamas...
Are your model files correct? The sha256sum of `ggml-alpaca-7b-q4.bin` should be ~~`8d5562ec1d8a7cfdcf8985a9ddf353339d942c7cf52855a92c9ff59f03b541bc`~~ edit: this was probably wrong; some (or all?) of the hashes currently found in the SHA256SUMS file are wrong: https://github.com/ggerganov/llama.cpp/issues/374
It seems I'm too tired to find the button for converting to a draft, but anyway. The `_cvtss_sh` and `_cvtsh_ss` intrinsics are still missing and not implemented yet, so don't...
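For reference, those are the scalar F16C conversions (fp32 ↔ fp16). Below is a minimal sketch of the kind of shim this implies, assuming MSVC does ship the vector forms `_mm_cvtph_ps`/`_mm_cvtps_ph`; the helper names `cvtsh_ss`/`cvtss_sh` are made up for illustration:

```c
#include <immintrin.h>
#include <stdint.h>

// GCC/Clang with F16C expose the scalar conversions _cvtsh_ss (fp16 -> fp32)
// and _cvtss_sh (fp32 -> fp16); MSVC does not. A scalar shim can be built on
// the vector forms _mm_cvtph_ps / _mm_cvtps_ph instead. The helper names
// below are hypothetical, for illustration only.
#ifdef _MSC_VER
static inline float cvtsh_ss(uint16_t h) {
    // Put the half in lane 0, convert the low halves to floats, read lane 0.
    return _mm_cvtss_f32(_mm_cvtph_ps(_mm_cvtsi32_si128(h)));
}
static inline uint16_t cvtss_sh(float f) {
    // Convert lane 0 with round-to-nearest-even, take the low 16 bits.
    return (uint16_t)_mm_extract_epi16(
        _mm_cvtps_ph(_mm_set_ss(f), _MM_FROUND_TO_NEAREST_INT), 0);
}
#else
static inline float    cvtsh_ss(uint16_t h) { return _cvtsh_ss(h); }
static inline uint16_t cvtss_sh(float f)    { return _cvtss_sh(f, 0); }
#endif
```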
I haven't checked the compiled output at the disassembly level yet, so especially in the case of F16C there is the question of to what extent the compiler had...
> Avx2/avx512 also implies all the simd instructions being enabled like sse3

Yeah, but `__SSE3__` isn't currently used as `__AVX__` takes precedence over it, so I didn't add it...
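Roughly the shape of the cascade I mean (an illustrative sketch, not the actual ggml source):

```c
// Illustrative sketch only, not the actual ggml code. Once an AVX branch is
// taken, the SSE3 branch below it is dead on that build, so adding __SSE3__
// handling there would change nothing for AVX-enabled builds.
static const char *simd_level(void) {
#if defined(__AVX512F__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE3__)
    return "SSE3"; // unreachable whenever any AVX level is enabled
#else
    return "scalar";
#endif
}
```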
> does that macro even exist? https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-160

It doesn't; that is the entire point.
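Since MSVC's predefined-macro list has `__AVX__`/`__AVX2__` but nothing like `__SSE3__` or `__F16C__`, one workaround is to derive the missing macros from the ones MSVC does define. A sketch, under the assumption that an `/arch:AVX` target implies SSE3 and an `/arch:AVX2` target implies F16C:

```c
// Sketch: derive feature macros MSVC never defines from the AVX levels it
// does define. Assumptions: /arch:AVX hardware has SSE3, and /arch:AVX2
// hardware has F16C.
#if defined(_MSC_VER)
  #if defined(__AVX__) && !defined(__SSE3__)
    #define __SSE3__ 1
  #endif
  #if defined(__AVX2__) && !defined(__F16C__)
    #define __F16C__ 1
  #endif
#endif
```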
> Do you observe improved performance with this change?

I'll have to take an in-depth look later, analysing the binary code and timing the performance; until then, no idea. In...
Huge thanks to @nicknitewolf and @KASR for providing some statistics. :+1: :partying_face: I've concluded that, unfortunately, as my CPU is a dog and only has 4 threads total, I can't provide useful statistics...