Alberto Cabrera Pérez
@ggerganov I've addressed all your comments. Let me know if something else is required.
Mmm. Let's revert this then. I'll reopen the branch as a draft PR and we can work out a better solution. I'd rather not introduce a regression upstream...
@ggerganov is there something else needed from my side, or are we waiting for another review?
I was able to replicate the PPL skyrocketing with the generic implementation as well:

```
# ggml_gemm_q4_K_8x8_q8_K_generic
perplexity: 34.48 seconds per pass - ETA 1.43 minutes
[1]9.6770,[2]1762.7802,[3]9505.4348,[4]22802.6452,[5]5311.2750,[6]10333.9703,[7]16582.8044,[8]23315.3388,[9]11093.7993,[10]14942.7293,

# ggml_gemm_q4_K_8x8_q8_K
perplexity:...
```
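For reference, this kind of perplexity measurement can be reproduced with the stock `llama-perplexity` tool; a minimal sketch, where the model and dataset paths are placeholders and not taken from this PR:

```sh
# Hypothetical invocation; adjust paths to the actual build and files.
# -m: a Q4_K-quantized model, which exercises the repacked GEMM path
# -f: raw evaluation text; per-pass perplexity is printed as in the log above
./build/bin/llama-perplexity \
  -m models/model-q4_k_m.gguf \
  -f wikitext-2-raw/wiki.test.raw
```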
I've opened #17030 for the fix.

> Hm yes - `Q4_0` with LFM is indeed also problematic. However `Q4_0` with llama 3.1 8B is good. So this means there is...
@ggerganov https://github.com/ggml-org/llama.cpp/pull/17241 fixed the perplexity issues, so this PR is ready for review again (it's rebased on top of master).
@ggerganov sorry for pinging again! I don't have merge rights. Could you please merge it?
Ah, sorry for the misunderstanding! I had another PR merged with a single review and didn't realize both approvals were needed here. Thanks!
LGTM, great work @joeatodd
@lslusarczyk Mind rebasing on top of master? I'm seeing very bad performance in this PR, but it seems to be related to not having #13343 in this branch. A previous...
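For completeness, a typical rebase of the feature branch onto current master looks roughly like this; remote and branch names are assumptions, not taken from the PR:

```sh
# Assumed remote/branch names; adjust to the actual fork setup.
git fetch upstream                                   # upstream = ggml-org/llama.cpp
git rebase upstream/master                           # replays the branch on top of master (picks up #13343)
git push --force-with-lease origin my-feature-branch # update the PR branch safely
```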