llama-node
[Error: Could not load model] { code: 'GenericFailure' }
I'm getting the error [Error: Could not load model] { code: 'GenericFailure' }
when trying to load a model:
$ node ./bin/llm/llm.js --model ~/models/gpt4-alpaca-lora-30B.ggml.q5_1.bin
[Error: Could not load model] { code: 'GenericFailure' }
I've modified the example a bit to take the model path as a --model argument:
import minimist from 'minimist';
import { LLM } from "llama-node";
import { LLamaRS } from "llama-node/dist/llm/llama-rs.js";
import path from "path";

const args = minimist(process.argv.slice(2));
const modelPath = args.model;
const model = path.resolve(modelPath);

const llama = new LLM(LLamaRS);

const template = `how are you`;
const prompt = `Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
${template}
### Response:`;

const params = {
    prompt,
    numPredict: 128,
    temp: 0.2,
    topP: 1,
    topK: 40,
    repeatPenalty: 1,
    repeatLastN: 64,
    seed: 0,
    feedPrompt: true,
};

const run = async () => {
    try {
        await llama.load({ path: model });
        await llama.createCompletion(params, (response) => {
            process.stdout.write(response.token);
        });
    } catch (err) {
        console.error(err);
    }
};

run();
Try using the llama.cpp backend; I think it supports more model types than llm-rs.
q5_1 may be supported later. I have not upgraded the llm-rs backend for it yet.
My models work fine with llm-rs.
@hlhr202 what does q5_1 mean?
Try using the llama.cpp backend; I think it supports more model types than llm-rs.
How do I do this?
It is a ggml quantization type; you can check it on the llama.cpp GitHub.
Try using the llama.cpp backend; I think it supports more model types than llm-rs.
How do I do this?
Check it here: https://llama-node.vercel.app/docs/backends/ and here: https://llama-node.vercel.app/docs/backends/llama.cpp/inference
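For reference, here is a minimal sketch of the original script switched over to the llama.cpp backend, based on the inference docs linked above. The config field names (modelPath, nCtx, nTokPredict, etc.) are taken from those docs and may differ between llama-node versions, so treat this as a starting point rather than an exact drop-in:

import { LLM } from "llama-node";
import { LLamaCpp } from "llama-node/dist/llm/llama-cpp.js";
import path from "path";

// Resolve the model path (e.g. a q5_1 ggml file) the same way as the original script.
const model = path.resolve(process.cwd(), "./gpt4-alpaca-lora-30B.ggml.q5_1.bin");
const llama = new LLM(LLamaCpp);

const run = async () => {
    // The llama.cpp backend takes a config object rather than { path }.
    await llama.load({
        modelPath: model,
        enableLogging: true,
        nCtx: 1024,
        seed: 0,
        f16Kv: false,
        logitsAll: false,
        vocabOnly: false,
        useMlock: false,
        embedding: false,
        useMmap: true,
    });

    await llama.createCompletion(
        {
            prompt: "how are you",
            nThreads: 4,
            nTokPredict: 128,
            topK: 40,
            topP: 1,
            temp: 0.2,
            repeatPenalty: 1,
        },
        (response) => {
            process.stdout.write(response.token);
        }
    );
};

run();

The completion callback works the same as with the llm-rs backend; only the load and completion parameter names change.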
I think this issue also needs an investigation of llama.cpp's LoRA support, but I'm still reading the llama.cpp implementation. I will probably bring this feature soon.
@ralyodio q5_1 models are now supported by the llama.cpp backend here.