Erik Scholz comments

Results: 282 comments of Erik Scholz

ci: add linux binaries to release build

Image, hmm. Installing CUDA now only takes about as long as the compile, ~1 min, so I don't really see the point of using Docker (I'm assuming that's what you mean with...

ci: add linux binaries to release build

https://github.com/Jimver/cuda-toolkit/issues/249 made it viable to install only part of the toolkit instead of the full one (without me manually installing the apt sources :smile: )

ci: add linux binaries to release build

It would be cool for Windows builds, those take forever. But for Linux builds is now

Use F16 for memory_k and memory_v (as suggested in #146)

Can confirm: `ggml ctx size` went from 4529.34 MB to 4273.34 MB, and speed stayed the same. It is hard to tell if the quality changes, but the prediction does (obviously).
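As a rough sanity check on those numbers, here is a minimal sketch of the key/value cache arithmetic. The model shape is an assumption, not something stated in the comment: it uses the LLaMA 7B hyperparameters (`n_layer = 32`, `n_embd = 4096`) and a context of 512 tokens, and the helper name `kv_cache_bytes` is made up for illustration. Under those assumptions, halving the element size of `memory_k` and `memory_v` would save 256 MB, which is consistent with the reported drop.

```python
def kv_cache_bytes(n_layer: int, n_ctx: int, n_embd: int, bytes_per_element: int) -> int:
    """Size of the K and V caches together (hypothetical helper)."""
    # One cache tensor holds n_embd values per token, per layer.
    elements_per_tensor = n_layer * n_ctx * n_embd
    # Two tensors: memory_k and memory_v.
    return 2 * elements_per_tensor * bytes_per_element


MB = 1024 * 1024

# Assumed shape: LLaMA 7B with a 512-token context (not stated in the comment).
N_LAYER, N_CTX, N_EMBD = 32, 512, 4096

f32_bytes = kv_cache_bytes(N_LAYER, N_CTX, N_EMBD, 4)  # F32: 4 bytes per element
f16_bytes = kv_cache_bytes(N_LAYER, N_CTX, N_EMBD, 2)  # F16: 2 bytes per element
saved_mb = (f32_bytes - f16_bytes) / MB

print(f"F32 cache: {f32_bytes / MB:.2f} MB")
print(f"F16 cache: {f16_bytes / MB:.2f} MB")
print(f"saved:     {saved_mb:.2f} MB")
```

The reported difference, 4529.34 MB minus 4273.34 MB, is also 256 MB, so the sketch matches the observation if the assumed shape is right. A different model or context length would scale the saving linearly.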

Use F16 for memory_k and memory_v (as suggested in #146)

I ran some more non-scientific tests: 7B: ![image](https://user-images.githubusercontent.com/2938071/225655593-65f9ad27-f1d3-41cb-8b59-328639f6c137.png) 30B: ![image](https://user-images.githubusercontent.com/2938071/225655650-b01d9167-0535-4400-841a-72454b28b392.png) Both were run with `-t 4 -n 2048 --repeat_penalty 1.176 --repeat_last_n 256 --temp 0.8 --top_p 0.1 -c 2048 --color...

Use F16 for memory_k and memory_v (as suggested in #146)

@ty-everett are you going to write the cli-param conditional version? if not, I will do it.

Added ggml as submodule

@ggerganov personally, I've been using submodules since forever, but I recently came across this post: https://diziet.dreamwidth.org/14666.html

Tanh is not implemented

> Yes, it does. Not sure if it is optimized code. Most compilers will detect the simple for loop and might unroll it or use SIMD to make it...

ggml : unified file format

Technically speaking, we also had a GGMFv1, the one before the memory-mapped GGJTv1.
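To make the format lineage concrete, here is a small sketch that tells the variants apart by their header. The magic values are recalled from the llama.cpp source and are an assumption here, since the thread does not quote them. The layout (a little-endian 32-bit magic followed by a 32-bit version for the versioned formats) is assumed as well, and `identify_format` is a hypothetical helper, not part of ggml.

```python
import struct

# Magic values as recalled from llama.cpp (assumption, not stated in the thread).
MAGIC_GGML = 0x67676D6C  # 'ggml', original unversioned format
MAGIC_GGMF = 0x67676D66  # 'ggmf', first versioned format
MAGIC_GGJT = 0x67676A74  # 'ggjt', memory-mapped format

MAGIC_NAMES = {MAGIC_GGML: "ggml", MAGIC_GGMF: "ggmf", MAGIC_GGJT: "ggjt"}


def identify_format(header: bytes) -> str:
    """Classify a model file from its first bytes (hypothetical helper)."""
    if len(header) < 4:
        return "unknown"
    # The magic is assumed to be stored as a little-endian uint32.
    (magic,) = struct.unpack("<I", header[:4])
    name = MAGIC_NAMES.get(magic)
    if name is None:
        return "unknown"
    if magic == MAGIC_GGML:
        # The oldest format carries no version field.
        return f"{name} (unversioned)"
    if len(header) < 8:
        return f"{name} (version missing)"
    (version,) = struct.unpack("<I", header[4:8])
    return f"{name} v{version}"


# Synthetic headers for illustration, not real model files.
print(identify_format(struct.pack("<II", MAGIC_GGMF, 1)))
print(identify_format(struct.pack("<II", MAGIC_GGJT, 1)))
```

In practice the header would come from something like `open(path, "rb").read(8)`.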

ggml : unified file format

There is also the new WIP .ggml file, which contains the computation graph: https://github.com/ggerganov/ggml/commit/3b697a2264c5dd132abb3268f6b1091536f3f9ff