Ivan Komarov

49 comments by Ivan Komarov

I think these changes look great. You said [elsewhere](https://github.com/ggerganov/llama.cpp/issues/1129#issuecomment-1526057911) that this stuff might cause "some friction", but I think it turns out to be very non-intrusive. The CUDA stuff is...

@slaren Yeah, I totally agree with your overall message -- adding new quantization methods (which, judging by the issues/discussions, will keep appearing) by porting a reference implementation should be easy, and...

@SlyEcho

> On a sidenote, there could also be a way to unify the CL and CUDA/ROCm kernels using some "clever" techniques

Oh wow, I managed to totally miss the...

> When I re-ran the tests with the kernel in this PR prompt processing was ~57% faster compared to master.

Whoa, this is nice to hear. Unfortunately I don't have...

> But more importantly, I think that my kernel is simpler

Yup, I think your version is pretty much what I started with. The additional complexity in my version...

Closing this, since this PR is outdated and largely superseded by @JohannesGaessler's [efforts](https://github.com/ggerganov/llama.cpp/pull/1341#issuecomment-1546690422).

I also stumbled upon this weird error. It turns out that [child_process.spawn()](https://nodejs.org/api/child_process.html#child_processspawncommand-args-options) reports `ENOENT` if the current working directory of the spawned process doesn't exist, even if the process binary itself...

@GaetanLepage

> Would it still be feasible to create a tag for this release?

JFYI: the tag is not likely to help here, since the commit history for the...

> Feel free to close this issue then.

Uh, sorry, I worded that confusingly. :/ Having a `v2.2.0` tag would be very valuable, exactly because it's supposed to be immutable...

> Can you confirm that it works?

It does, thank you! I just closed issue #3525. Regarding this issue: given that the branch history is restored, I think it is...