node-llama-cpp
Run AI models locally on your machine with Node.js bindings for llama.cpp. Enforce a JSON schema on the model's output at the generation level.
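The JSON-schema enforcement mentioned in the description can be sketched like this (based on the v3 API; method names such as `createGrammarForJsonSchema` may vary between versions, and the model path is a placeholder):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

// A grammar built from a JSON schema constrains generation at the token level,
// so the output is forced to match the schema
const grammar = await llama.createGrammarForJsonSchema({
    type: "object",
    properties: {
        answer: {type: "string"},
        confidence: {type: "number"}
    }
});

const response = await session.prompt("What is the capital of France?", {grammar});
const parsed = grammar.parse(response); // an object matching the schema
```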
### Issue description
LlamaCpp crashes when embedding.

### Expected Behavior
The code generates a correct embedding vector.

### Actual Behavior
LlamaCpp crashed with this error:
```
zsh: segmentation fault  node server/test.js...
```
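For reference, a minimal embedding flow that would be expected to produce a vector (a sketch based on the v3 API; `createEmbeddingContext` and `getEmbeddingFor` come from the beta docs, and the model path is a placeholder):

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({
    modelPath: "path/to/embedding-model.gguf" // placeholder path
});

// Embeddings use a dedicated context type, separate from chat contexts
const embeddingContext = await model.createEmbeddingContext();
const embedding = await embeddingContext.getEmbeddingFor("Hello world");

console.log(embedding.vector.length); // dimensionality of the embedding vector
```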
### Issue description
When bundling `node-llama-cpp` with webpack and TypeScript, something odd happens: webpack appears to load the module as a promise. Once that promise is resolved, everything works...
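One common workaround for native modules in this situation, sketched here under the assumption that the promise wrapping comes from webpack bundling the package itself, is to mark `node-llama-cpp` as an external so Node.js resolves it directly at runtime (illustrative only, not a confirmed fix for this issue):

```typescript
// webpack.config.ts: leave node-llama-cpp out of the bundle
import type {Configuration} from "webpack";

const config: Configuration = {
    target: "node",
    experiments: {
        outputModule: true // emit an ESM bundle
    },
    externalsType: "module",
    externals: {
        // resolved by Node.js at runtime instead of being bundled
        "node-llama-cpp": "module node-llama-cpp"
    }
};

export default config;
```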
## How to use this beta
To install the beta version of `node-llama-cpp`, run this command inside your project:
```bash
npm install node-llama-cpp@beta
```
To get started quickly, generate...
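As a minimal sketch of the beta's basic chat flow (names such as `getLlama` and `LlamaChatSession` come from the v3 beta; the model path is a placeholder):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

const answer = await session.prompt("Hi there, how are you?");
console.log(answer);
```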
Support creating these types of projects:
* Node with TypeScript using `vite-node`
* Electron app with TypeScript
* Node with plain JavaScript
### Feature Description
llama.cpp can cache prompts to a specific file via the `--prompt-cache` flag. I think that exposing this through node-llama-cpp would enable some techniques for...
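What such an option could look like, purely as a hypothetical sketch: `promptCachePath` is an invented name for illustration, not an existing node-llama-cpp parameter, and `model` is assumed to be loaded as in the earlier examples:

```typescript
// Hypothetical sketch: `promptCachePath` is an invented option mirroring
// llama.cpp's `--prompt-cache` flag. The idea is to persist the evaluated
// prompt state to a file so later runs with the same prefix skip re-evaluation.
const context = await model.createContext({
    promptCachePath: "./cache/system-prompt.bin" // invented option name
});
```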
Also, automatically set the right `contextSize` and provide other good defaults to make usage smoother.
* Support configuring the context swapping size for infinite text generation (by default, it'll...
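For comparison, an explicit `contextSize` can be passed when creating a context (a sketch based on the v3 API; the default behavior and option names may differ across versions):

```typescript
import {getLlama} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path

// Explicit context size; when omitted, the library would pick a default
// based on the model's training context and available memory
const context = await model.createContext({contextSize: 4096});
```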
Once `llama.cpp`'s support for this is stable. Hopefully there will be an official API for it after https://github.com/ggerganov/llama.cpp/issues/9643 is implemented.
### Feature Description
Ability to change the LoRA adapter dynamically after loading a LLaMA model.

### The Solution
See the `llama_model_apply_lora_from_file()` function in `llama.cpp`: https://github.com/ggerganov/llama.cpp/blob/e9c13ff78114af6fc6a4f27cc8dcdda0f3d389fb/llama.h#L353C1-L359C1

### Considered Alternatives
None.

### Additional Context
_No response_
###...
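Purely as an illustration, a binding over that function might be surfaced like this; `applyLora`, `loraPath`, and `scale` are invented names and not part of the current node-llama-cpp API:

```typescript
// Hypothetical sketch: an invented wrapper over
// llama_model_apply_lora_from_file(), assuming `model` is already loaded
await model.applyLora({
    loraPath: "path/to/adapter.bin", // LoRA adapter file (invented option)
    scale: 1.0                       // blend strength of the adapter (invented option)
});
```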
### Description of change
* feat: split gguf files support
* feat: `pull` command
* feat: `stopOnAbortSignal` and `customStopTriggers` on `LlamaChat` and `LlamaChatSession`
* feat: `checkTensors` parameter on `loadModel`
* ...
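A hedged sketch of how `stopOnAbortSignal` might be used with `LlamaChatSession.prompt`, based on the v3 beta API (the model path is a placeholder):

```typescript
import {getLlama, LlamaChatSession} from "node-llama-cpp";

const llama = await getLlama();
const model = await llama.loadModel({modelPath: "path/to/model.gguf"}); // placeholder path
const context = await model.createContext();
const session = new LlamaChatSession({contextSequence: context.getSequence()});

const abortController = new AbortController();
setTimeout(() => abortController.abort(), 5000); // abort generation after 5 seconds

// With stopOnAbortSignal, aborting ends generation gracefully and the partial
// response generated so far is returned instead of an error being thrown
const answer = await session.prompt("Write a long story", {
    signal: abortController.signal,
    stopOnAbortSignal: true
});
console.log(answer);
```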