61 comments by Meng, Hengyu

@shibe2 I agree it is more of a feature enhancement. I think it would be quite useful if llama.cpp could calculate an appropriate n_ctx automatically, especially for serving. Any plans on...
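For reference, a hedged sketch of the current manual workaround: llama.h documents `n_ctx = 0` as "from model", so the server can at least inherit the model's training context (binary name and flag behavior assumed from the current tree; an automatic calculation would also need to account for free device memory):

```
# take the context size from the GGUF metadata instead of guessing a number
.\bin\llama-server -m model.gguf -c 0
```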

@ClarkChin08 can you attach the measurement results? e.g. for llama3-70B on 8 GPUs: memory consumption on each GPU and performance?

I did some searching and found related issues on the intel/llvm repo. @MrSidims, I saw a similar issue at https://github.com/intel/llvm/pull/4025#issuecomment-870823000; could you give us some guidance?

ok, can you run the `tanh` test alone and see whether it crashes each time?

```
.\bin\test-backend-ops -b SYCL0 -o TANH
```
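If that narrows it down, `test-backend-ops` can also run the whole op suite or time a single op; a sketch, assuming the `perf` mode is available in your build:

```
.\bin\test-backend-ops -b SYCL0                # run the full correctness suite on SYCL0
.\bin\test-backend-ops perf -b SYCL0 -o TANH   # benchmark TANH instead of checking it
```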

It seems there might be a more fundamental issue causing this. As a temporary solution, could you please try updating your driver, operating system, kernel, and oneAPI? This might address...
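Before and after updating, it may help to record what the runtime actually sees; `sycl-ls` ships with oneAPI (output format varies by version):

```
# list the SYCL devices visible to the runtime, with driver versions
sycl-ls
```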

hi @MrSidims, thank you for your quick reply. I can confirm no AOT option is set currently; the whole compilation command is here: https://github.com/ggerganov/llama.cpp/blob/0d2c7321e9678e91b760ebe57f0d063856bb018b/ggml/src/CMakeLists.txt#L465-L518

> If this compilation happens in JIT...
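For context, a minimal sketch of what enabling AOT would look like with the standard DPC++ flags (the device id below is an assumption; substitute the one for your GPU):

```
# hypothetical standalone AOT compile: -fsycl-targets=spir64_gen selects
# ahead-of-time compilation, and -Xs forwards the device id to ocloc
icpx -fsycl -fsycl-targets=spir64_gen -Xs "-device acm-g10" main.cpp -o main
```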