llama-node
Can't run example on llama-2-13b-chat q4_0
I apologize in advance if I omit any useful details; I'm just a simple dev with no background in DS, so I'm in trial-and-error land.
I followed the llama.cpp instructions for the llama-2-13b-chat model, and I now have the q4_0 file: llama-2-13b-chat/ggml-model-q4_0.gguf.
I'm using the example code from this repo, changed only to point at my model file, but loading fails:
The code:
import { LLM } from 'llama-node';
import { LLamaCpp } from 'llama-node/dist/llm/llama-cpp.js';
import path from 'path';

const model = path.resolve(
  process.cwd(),
  '../llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf',
);
console.log(model);

const llama = new LLM(LLamaCpp);

/** @type {import('llama-node/dist/llm/llama-cpp').LoadConfig} */
const config = {
  modelPath: model,
  enableLogging: true,
  nCtx: 1024,
  seed: 0,
  f16Kv: false,
  logitsAll: false,
  vocabOnly: false,
  useMlock: false,
  embedding: false,
  useMmap: true,
  nGpuLayers: 128,
};

const template = `How are you?`;
const prompt = `A chat between a user and an assistant.
USER: ${template}
ASSISTANT:`;

const params = {
  nThreads: 4,
  nTokPredict: 2048,
  topK: 40,
  topP: 0.1,
  temp: 0.2,
  repeatPenalty: 1,
  prompt,
};

const run = async () => {
  await llama.load(config);
  await llama.createCompletion(params, response => {
    process.stdout.write(response.token);
  });
};

run();
The error:
Debugger listening on ws://127.0.0.1:59899/c72280cb-a098-4c15-859f-54025e513896
For help, see: https://nodejs.org/en/docs/inspector
Debugger attached.
/Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf
llama.cpp: loading model from /Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf
error loading model: unknown (magic, version) combination: 46554747, 00000001; is this really a GGML file?
llama_init_from_file: failed to load model
Waiting for the debugger to disconnect...
node:internal/process/promises:288
triggerUncaughtException(err, true /* fromPromise */);
^
[Error: Failed to initialize LLama context from file: /Users/gioraguttsait/Git/personal-repos/llm/llama.cpp/models/llama-2-13b-chat/ggml-model-q4_0.gguf] {
code: 'GenericFailure'
}
Node.js v18.17.1
I can see that the error refers to a (magic, version) combination it doesn't expect in the file (error loading model: unknown (magic, version) combination: 46554747, 00000001; is this really a GGML file?), and I notice the model is a .gguf file, not a ggml one.
From a quick Google search I got to this post on r/LocalLLaMA, which states that GGUF is essentially the successor to GGML.
I have literally 0 understanding of what I'm doing, and would appreciate if someone could point me in some direction of how to deal with it. Even just pointing out keywords I might have missed which could have led me to find a better answer in the first place 😅
Thanks in advance for your time!
Exact same issue here. Did you manage to find a workaround? I might be wrong, but it doesn't look like this library's llama-cpp binding has been updated in ~4 months; I wonder if that's the issue.
Is there a way to overlay a newer version of llama.cpp?