Max Krasnyansky
@ggerganov Thanks for fixing up q8_0_q8_0 (good eye; it was a cut-and-paste error that I missed and CI didn't catch). Should be good to merge now. I have more updates...
@hmartinez82

> 8cx Gen 3

Interesting. I didn't know int8 matmul works on the 8cx Gen 3. That's great! Can you please try running the armv8.7-a-compiled binaries as-is? It might just...
@hmartinez82 If you use the same quantization (q4_0) for both llama 2 and 3 then they would both use matmul-int8 (if enabled). It's probably crashing in some other code path...
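For context on the int8 matmul discussion above: on Linux, a quick way to see whether a CPU actually reports the Arm int8 matmul feature is to look for the `i8mm` flag in `/proc/cpuinfo`. This is a hedged sketch, not part of the PR; the flag name is the standard kernel one, but the script itself is illustrative.

```shell
# Sketch: check whether this Arm CPU reports the i8mm (FEAT_I8MM, int8 matmul)
# feature. Binaries built with -march=armv8.7-a use these instructions, so they
# will fault on CPUs that do not have the feature.
# The Linux kernel lists per-CPU feature flags in /proc/cpuinfo.
if grep -qw i8mm /proc/cpuinfo 2>/dev/null; then
    i8mm_status="supported"
else
    i8mm_status="not reported"
fi
echo "i8mm: ${i8mm_status}"
```

On a non-Arm or non-Linux machine the flag simply won't be reported, which is itself the useful signal before trying the armv8.7-a binaries.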
@ggerganov Any objections to merging this? Please let me know if you have any questions/suggestions.
> Could you add some documentation about how to use the `CMakePresets.json` file? A comment in the PR description is enough.

If I understand correctly, this is not being used...
@slaren Please don't forget to hit that merge button :) Would be good to avoid further rebases while all checks are passing. I wanted to retest released binaries and will...
> The `CMakePresets.json` file has been giving me issues. Visual Studio Code is available on all OSes, and this is set up specifically for Windows. I'm now greeted with a prompt...
> @max-krasnyansky These are usually auto-generated, but can be hand-crafted. Please see the CMake documentation link I included above. And yes, the things you listed are Windows-specific; that's the...
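To illustrate what a hand-crafted `CMakePresets.json` of this kind looks like, here is a minimal sketch. The preset name and toolchain-file path are illustrative, not necessarily the ones in the repo; the schema follows the CMake presets format, and a preset is consumed with `cmake --preset <name>`.

```json
{
  "version": 4,
  "configurePresets": [
    {
      "name": "arm64-windows-release",
      "generator": "Ninja",
      "binaryDir": "${sourceDir}/build-${presetName}",
      "cacheVariables": {
        "CMAKE_BUILD_TYPE": "Release",
        "CMAKE_TOOLCHAIN_FILE": "${sourceDir}/cmake/arm64-windows-llvm.cmake"
      }
    }
  ]
}
```

Usage would then be `cmake --preset arm64-windows-release` followed by `cmake --build build-arm64-windows-release`. Presets that reference a Windows-only toolchain file are what triggers the prompt on other OSes mentioned above.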
@lhez please take a look; it makes sense to add multi-device support. @linehill please rebase once we merge #12886, when you get the chance.
> [@max-krasnyansky](https://github.com/max-krasnyansky), [@lhez](https://github.com/lhez) as people who seem to know about the OpenCL backend, do you have an opinion on the right way to fix this?

Oh. Missed this one...