Alberto Cabrera Pérez issues

Results: 6 issues by Alberto Cabrera Pérez

Device code not split in modules when using aspect::atomic64

https://github.com/intel/llvm/pull/12673 enabled tests for the opencl:fpga backend, which doesn't support 64-bit atomics. The following tests, syclcompat/atomic/atomic_arith.cpp and syclcompat/atomic/atomic_comp_exchange.cpp, failed with the following output: ``` # .---command stderr------------ # | terminate called...

bug

confirmed

[SYCL][COMPAT] fixed atomic_compare_exchange_strong not using addressSpace template parameter

UNREACHABLE executed at llvm/lib/CodeGen/ValueTypes.cpp

### Describe the bug ``` clang++ -DGGML_SCHED_MAX_COPIES=4 -DGGML_USE_LLAMAFILE -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/sources/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wunreachable-code-break -Wunreachable-code-return -Wmissing-prototypes -Wextra-semi -march=native -MD -MT CMakeFiles/llama.dir/unicode.cpp.o -MF...

bug

confirmed

sycl : implementation of reordered Q4_0 MMVQ for Intel GPUs

This PR extends the work introduced in https://github.com/ggml-org/llama.cpp/pull/12035. MMVQ Q4_0 now supports the block_q_t reorder layout. The improvements show up in text generation. The improvement of PP512 in the DataMax...

ggml

SYCL

ggml-cpu: handle 3d tensors in repack mat_mul

While testing #16739, perplexities for LFM2 skyrocketed. @ggerganov pointed out that some matrix shapes would probably not be supported. LFM2 has some layers that have two batches, so MAT_MULs were...

ggml

ggml-cpu: arm64: q4_K repack gemm and gemv implementations (i8mm)

This PR improves q4_k_q8_k gemm and gemv on arm64 using i8mm and vecdot instructions. Tested on an Apple M4 Max: ### REPACK vs NO REPACK | model | backend |...

ggml