LLaMPPL
Update to more recent ggml format
I ported your llama.cpp changes onto the most recent llama.cpp.
Then I had to modify LLaMPPL/llamppl/llama_cpp.py to use the new code from llama_cpp_python; you can see the new file here.
The easier change on your end is probably to pull the changes from main into your llama_cpp branch and edit llama_cpp.py, but these are here if needed.
Edit: hmm, I'm hitting this issue with my changes when I offload to the GPU; hold on, let me look into it:
GGML_ASSERT: C:\...\llama-cpp-python\vendor\llama.cpp\ggml.c:15154: tensor->src0->backend == GGML_BACKEND_CPU
Edit 2: never mind, those asserts fire just because eval_multi doesn't have GPU support yet.
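A minimal sketch of the workaround implied above: keep every layer on the CPU whenever eval_multi will run, so the GGML_ASSERT on `tensor->src0->backend == GGML_BACKend_CPU` in ggml.c cannot trip. `pick_n_gpu_layers` is a hypothetical helper for illustration; `n_gpu_layers` is the real llama-cpp-python constructor parameter you would feed the result into.

```python
def pick_n_gpu_layers(requested: int, uses_eval_multi: bool) -> int:
    """Return 0 (pure CPU inference) when eval_multi will be used,
    since it has no GPU support yet; otherwise offload as requested."""
    return 0 if uses_eval_multi else requested


# With LLaMPPL's eval_multi in play, force CPU-only inference:
n_gpu_layers = pick_n_gpu_layers(requested=35, uses_eval_multi=True)
print(n_gpu_layers)  # 0

# Once eval_multi gains GPU support, the requested offload passes through:
print(pick_n_gpu_layers(requested=35, uses_eval_multi=False))  # 35
```

You would then pass the result as `Llama(model_path=..., n_gpu_layers=n_gpu_layers)`; whether that is the right call site for this fork of llama_cpp.py is an assumption.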