anzz1
Some checksums (q4_0 and gptq-4b quantizations, new tokenizer format): [ggml-q4-checksums.zip](https://github.com/ggerganov/llama.cpp/files/11053116/ggml-q4-checksums.zip) edit: added more checksums
@Green-Sky Yeah, there is only one; I might be thinking ahead too much. :smile: Also added some more checksums for the gptq-4b models above: https://github.com/ggerganov/llama.cpp/issues/374#issuecomment-1480719278
Yes, it might be good to differentiate them, as some have short fur and some long, and some are friendlier than others. But llamas will always be llamas...
Are your model files correct? The sha256sum of `ggml-alpaca-7b-q4.bin` should be ~~`8d5562ec1d8a7cfdcf8985a9ddf353339d942c7cf52855a92c9ff59f03b541bc`~~ edit: this was probably wrong; some (or all?) of the hashes currently found in the SHA256SUMS file are wrong: https://github.com/ggerganov/llama.cpp/issues/374
It seems I'm too tired to find the button for converting to a draft, but anyway. The `_cvtss_sh` and `_cvtsh_ss` intrinsics are still missing and not implemented yet, so don't...
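For reference, those are the scalar F16C conversions (fp32 ↔ fp16). Below is a minimal sketch of the kind of shim this implies, assuming MSVC does ship the vector forms `_mm_cvtph_ps`/`_mm_cvtps_ph`; the helper names `cvtsh_ss`/`cvtss_sh` are made up for illustration:

```c
#include <immintrin.h>
#include <stdint.h>

// GCC/Clang with F16C expose the scalar conversions _cvtsh_ss (fp16 -> fp32)
// and _cvtss_sh (fp32 -> fp16); MSVC does not. A scalar shim can be built on
// the vector forms _mm_cvtph_ps / _mm_cvtps_ph instead. The helper names
// below are hypothetical, for illustration only.
#ifdef _MSC_VER
static inline float cvtsh_ss(uint16_t h) {
    // Put the half in lane 0, convert the low halves to floats, read lane 0.
    return _mm_cvtss_f32(_mm_cvtph_ps(_mm_cvtsi32_si128(h)));
}
static inline uint16_t cvtss_sh(float f) {
    // Convert lane 0 with round-to-nearest-even, take the low 16 bits.
    return (uint16_t)_mm_extract_epi16(
        _mm_cvtps_ph(_mm_set_ss(f), _MM_FROUND_TO_NEAREST_INT), 0);
}
#else
static inline float    cvtsh_ss(uint16_t h) { return _cvtsh_ss(h); }
static inline uint16_t cvtss_sh(float f)    { return _cvtss_sh(f, 0); }
#endif
```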
I haven't checked the compiled output at the disassembly level yet, so especially in the case of F16C there is the question of to what extent the compiler had...
> Avx2/avx512 also implies all the simd instructions being enabled like sse3

Yeah, but `__SSE3__` isn't currently used as `__AVX__` takes precedence over it, so I didn't add it...
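Roughly the shape of the cascade I mean (an illustrative sketch, not the actual ggml source):

```c
// Illustrative sketch only, not the actual ggml code. Once an AVX branch is
// taken, the SSE3 branch below it is dead on that build, so adding __SSE3__
// handling there would change nothing for AVX-enabled builds.
static const char *simd_level(void) {
#if defined(__AVX512F__)
    return "AVX-512";
#elif defined(__AVX2__)
    return "AVX2";
#elif defined(__AVX__)
    return "AVX";
#elif defined(__SSE3__)
    return "SSE3"; // unreachable whenever any AVX level is enabled
#else
    return "scalar";
#endif
}
```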
> does that macro even exist? https://learn.microsoft.com/en-us/cpp/preprocessor/predefined-macros?view=msvc-160

It doesn't; that is the entire point.
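Since MSVC's predefined-macro list has `__AVX__`/`__AVX2__` but nothing like `__SSE3__` or `__F16C__`, one workaround is to derive the missing macros from the ones MSVC does define. A sketch, under the assumption that an `/arch:AVX` target implies SSE3 and an `/arch:AVX2` target implies F16C:

```c
// Sketch: derive feature macros MSVC never defines from the AVX levels it
// does define. Assumptions: /arch:AVX hardware has SSE3, and /arch:AVX2
// hardware has F16C.
#if defined(_MSC_VER)
  #if defined(__AVX__) && !defined(__SSE3__)
    #define __SSE3__ 1
  #endif
  #if defined(__AVX2__) && !defined(__F16C__)
    #define __F16C__ 1
  #endif
#endif
```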
> Do you observe improved performance with this change?

I'll have to take an in-depth look later, analysing the binary code and timing the performance; until then, no idea. In...
Huge thanks to @nicknitewolf and @KASR for providing some statistics. :+1: :partying_face: I've concluded that, unfortunately, as my CPU is a dog and only has 4 threads total, I can't provide useful statistics...