maksi65432
It's likely a case of clang going wrong; starting from clang 17 it sometimes has this bug.
You can at least try, but the standard llama.cpp build does not include GPU support by default; you first need to read the build docs: https://github.com/ggml-org/llama.cpp/blob/master/docs/build.md
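For example, a minimal GPU-enabled build sketch, assuming an NVIDIA card and the CUDA backend (other backends such as Vulkan or Metal use different flags; check the build docs above for your hardware):

```
# Configure with the CUDA backend enabled (assumes the CUDA toolkit is installed)
cmake -B build -DGGML_CUDA=ON

# Build the release binaries
cmake --build build --config Release
```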
If it's not a BitNet model, it likely means the conversion went slightly wrong; it can just be that one of the layers got an additional neuron connected, which causes it,...
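If you suspect the conversion, one thing to try is re-running it from the original checkpoint. A sketch, assuming the model was converted with llama.cpp's convert_hf_to_gguf.py (the paths and output type here are just examples):

```
# Re-convert the original HF checkpoint to GGUF (paths are hypothetical examples)
python convert_hf_to_gguf.py /path/to/hf-model --outfile model-f16.gguf --outtype f16
```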
I think it's either a problem with the tokenizer, or a problem with the BitNet team not using NL (nonlinear) quantization. And I recommend trying without the -cnv flag, because in the docs...
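For instance, a plain completion run without the -cnv conversation flag, assuming the bitnet.cpp run_inference.py entry point (the model path and prompt are just examples):

```
# Run in plain completion mode, without the -cnv conversation flag
# (model path and prompt are hypothetical examples)
python run_inference.py -m models/model.gguf -p "What is BitNet?" -n 64
```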
Oh, then I see: it's the error of the Pipeline processing empty responses. It's only debuggable by adding a lot of debug output into the Pipeline and the model itself. :(
What worked for me: https://packages.debian.org/bookworm/libsystemd-dev
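That is, installing the systemd development headers on Debian bookworm (assuming an apt-based setup):

```
# Install the systemd development headers on Debian
sudo apt install libsystemd-dev
```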