Alberto Cabrera Pérez

Results 6 issues of Alberto Cabrera Pérez

https://github.com/intel/llvm/pull/12673 enabled tests for the opencl:fpga backend, which doesn't support 64-bit atomics. The following tests, syclcompat/atomic/atomic_arith.cpp and syclcompat/atomic/atomic_comp_exchange.cpp, failed with the following output: ``` # .---command stderr------------ # | terminate called...

bug
confirmed

### Describe the bug ``` clang++ -DGGML_SCHED_MAX_COPIES=4 -DGGML_USE_LLAMAFILE -D_GNU_SOURCE -D_XOPEN_SOURCE=600 -I/sources/llama.cpp/. -O3 -DNDEBUG -std=gnu++11 -Wmissing-declarations -Wmissing-noreturn -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function -Wunreachable-code-break -Wunreachable-code-return -Wmissing-prototypes -Wextra-semi -march=native -MD -MT CMakeFiles/llama.dir/unicode.cpp.o -MF...

bug
confirmed

This PR extends the work introduced in https://github.com/ggml-org/llama.cpp/pull/12035. MMVQ Q4_0 now supports the block_q_t reorder layout. The improvements are reflected in text generation. The improvement of PP512 in the DataMax...

ggml
SYCL

While testing #16739, perplexities for LFM2 skyrocketed. @ggerganov pointed out that some matrix shapes would probably not be supported. LFM2 has some layers that have two batches, so MAT_MULs were...

ggml

This PR improves q4_k_q8_k gemm and gemv on arm64 using i8mm and vecdot instructions. Tested on an Apple M4 Max: ### REPACK vs NO REPACK | model | backend |...

ggml