David-AU-github
Is this in the main branch / available? This could solve an issue merging 7B and 13B models, as well as 10.7B/11B and 13B models, and issues with 20B /...
Thank you; I will try this out. I got the two files and will give it a go. My case is the opposite -> expanding a model. First ->...
Try quantizing with the flag --leave-output-tensor. For IQ3_XS ... it may help? This flag will raise the file size slightly, but keeps the output tensor at the original FP16/FP32 regardless of imatrix or reg...
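For reference, a hedged sketch of the invocation (binary name, model paths, and the IQ3_XS type spelling are illustrative; recent llama.cpp builds name the tool llama-quantize, so check your build's `--help` output):

```shell
# Quantize to IQ3_XS but keep the output tensor at its original
# FP16/FP32 precision; the file grows slightly as a result.
./llama-quantize --leave-output-tensor model-f16.gguf model-iq3_xs.gguf IQ3_XS
```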
I have noticed the same issue -> conducted tests as follows (via LM Studio): 1 - 6: long-form output (one prompt, no regen, one shot) -> GPU (CUDA/NVIDIA) 2...
Here are the prompt and method to reproduce the results; for clarity, GPU only and CPU only. (I can also create a PDF with the results, as per test...
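A minimal sketch of how the GPU-only and CPU-only runs could be reproduced from the command line with llama.cpp's llama-cli (model path and prompt are placeholders; `-ngl` controls how many layers are offloaded to the GPU):

```shell
# GPU-only: offload all layers (requires a CUDA build).
./llama-cli -m model.gguf -ngl 99 --seed 1 -p "PROMPT"

# CPU-only: offload zero layers.
./llama-cli -m model.gguf -ngl 0 --seed 1 -p "PROMPT"
```

Fixing the seed keeps sampling comparable between the two runs, so any remaining divergence comes from the backend math rather than the sampler.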
My two cents here: "With the patch" -> word choice is more nuanced and precise, and sentence structure is also somewhat better. It is definitely higher quality. That being said, "general...
> Regarding the patch, on further thought, the computation is correct even without it since we handle the "leftover" elements in the last non-64 block: > > https://github.com/ggerganov/llama.cpp/blob/8cc91dc63c0df397d644a581b2cbeea74eb51ae0/ggml.c#L1537-L1541 > >...
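To illustrate the "leftover elements" point being quoted: the kernels walk the data in fixed-size blocks and then finish the tail separately, so the remainder is still accounted for. A self-contained sketch of that pattern (not the actual ggml.c code, which is vectorized):

```c
#include <assert.h>

/* Sum n floats by processing full blocks of 64 first, then the
 * leftover tail, mirroring the block/remainder split in the kernel. */
static float sum_blocked(const float *x, int n) {
    float sum = 0.0f;
    int i = 0;
    const int nb = (n / 64) * 64;  /* elements covered by full blocks */
    for (; i < nb; i += 64)
        for (int j = 0; j < 64; ++j)
            sum += x[i + j];
    for (; i < n; ++i)             /* leftover non-64-block elements */
        sum += x[i];
    return sum;
}
```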
Thank you for all you do. And thank you for the reply too. FYI: Finally got llama.cpp installed on my windows machine. Put the details and fixes in another ticket...
For CUDA versions < 11.7 a target CUDA architecture must be explicitly provided via CUDA_DOCKER_ARCH
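For example (image tag, Dockerfile path, and the compute_86 value are illustrative; pick the architecture that matches your GPU):

```shell
# Build the CUDA image with an explicit target architecture,
# required when the base image's CUDA version is < 11.7.
docker build -t local/llama.cpp:cuda \
  --build-arg CUDA_DOCKER_ARCH=compute_86 \
  -f .devops/main-cuda.Dockerfile .
```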
Windows 11 / NVIDIA 4060 Ti 16 GB - same issue -> getting the GPU to work. NOTE: the following was done beforehand: - Visual Studio Community (2022) must be installed. With...
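The CUDA build itself can be sketched as follows, run from a developer prompt after Visual Studio 2022 and the CUDA toolkit are installed (older llama.cpp releases spell the flag LLAMA_CUBLAS instead of GGML_CUDA):

```shell
cmake -B build -DGGML_CUDA=ON
cmake --build build --config Release
```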
> If DARE-Ties gives dramatically different results each time, maybe I don't understand it correctly, but that sounds less like a good thing and more like a bad thing. This...
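The run-to-run variation is inherent to DARE's design: delta parameters are randomly dropped and the survivors rescaled, so each seed keeps a different subset while preserving the expected value. A minimal NumPy sketch of that drop-and-rescale step (function name and signature are my own, not mergekit's API):

```python
import numpy as np

def dare_drop(delta, p, seed):
    """Zero each delta parameter with probability p, then rescale the
    survivors by 1/(1-p) so the expected value is preserved."""
    rng = np.random.default_rng(seed)
    keep = rng.random(delta.shape) >= p  # keep with probability 1-p
    return delta * keep / (1.0 - p)

# Different seeds keep different parameter subsets, which is why
# repeated DARE-TIES merges of the same models can differ.
a = dare_drop(np.ones(1000), p=0.5, seed=0)
b = dare_drop(np.ones(1000), p=0.5, seed=1)
```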