
WebAssembly binding for llama.cpp - Enabling in-browser LLM inference

16 wllama issues

Here's the function:

```typescript
function parseModelUrl(url: string) {
  const urlPartsRegex = /(.*)-(\d{5})-of-(\d{5})\.gguf$/;
  const matches = url.match(urlPartsRegex);
  if (!matches || matches.length !== 4) return url;
  const baseURL = matches[1];
  const paddedShardsAmount...
```
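The excerpt above is cut off. A hedged reconstruction of what the full helper plausibly does is to expand a sharded GGUF URL such as `model-00001-of-00005.gguf` into the complete list of shard URLs; the shard-expansion logic below is a sketch, not the issue's verbatim code:

```typescript
// Sketch: if the URL matches the "-XXXXX-of-YYYYY.gguf" shard pattern,
// return the URLs of all shards; otherwise return the URL unchanged.
function parseModelUrl(url: string): string | string[] {
  const urlPartsRegex = /(.*)-(\d{5})-of-(\d{5})\.gguf$/;
  const matches = url.match(urlPartsRegex);
  if (!matches || matches.length !== 4) return url; // not a sharded model
  const baseURL = matches[1];
  const totalShards = parseInt(matches[3], 10);
  // Build zero-padded shard ids: "00001", "00002", ...
  const paddedShardIds = Array.from({ length: totalShards }, (_, i) =>
    String(i + 1).padStart(5, '0'),
  );
  return paddedShardIds.map(
    (shard) => `${baseURL}-${shard}-of-${matches[3]}.gguf`,
  );
}

// parseModelUrl('https://example.com/model-00001-of-00003.gguf')
// => [...-00001-of-00003.gguf, ...-00002-of-00003.gguf, ...-00003-of-00003.gguf]
```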

Resolves #42. Resolves #43. `loadModel()` now also accepts `Blob` or `File`. TODO: add example
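Pending that TODO, a minimal sketch of how the new API might be used, assuming `loadModel()` takes an array of `Blob`/`File` shards (the asset paths and exact signature here are assumptions, not the PR's verbatim example):

```typescript
import { Wllama } from '@wllama/wllama';

// Hypothetical wasm asset paths; adjust to wherever the files are served from.
const CONFIG_PATHS = {
  'single-thread/wllama.wasm': '/esm/single-thread/wllama.wasm',
  'multi-thread/wllama.wasm': '/esm/multi-thread/wllama.wasm',
};

const wllama = new Wllama(CONFIG_PATHS);

// Load a model from a user-picked File instead of a URL.
const input = document.querySelector<HTMLInputElement>('#model-file')!;
const file = input.files![0]; // File extends Blob
await wllama.loadModel([file]);
```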

Currently a model can fail to load for a number of different reasons. However, the error raised seems to always be a general "failed to load" error. It would be...
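Until more specific errors land, one workaround (a sketch, not a wllama API) is to pre-check the model URL before loading, so that at least HTTP failures surface with their status code instead of a generic "failed to load":

```typescript
// Probe the model URL with a HEAD request before handing it to wllama.
async function checkModelUrl(url: string): Promise<void> {
  const res = await fetch(url, { method: 'HEAD' });
  if (!res.ok) {
    throw new Error(`Model URL returned HTTP ${res.status} for ${url}`);
  }
}
```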

Because I [made a typo](https://github.com/ngxson/wllama/issues/56) in the URL of a local model file, I noticed something strange: the invalid URL seems to have ended up in the `wllama_cache` anyway. I checked...
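To verify the symptom, the standard Cache Storage API can list what ended up in the cache (the cache name `wllama_cache` is taken from the report; wllama's internal storage details may differ):

```typescript
// List every request cached under 'wllama_cache'.
const cache = await caches.open('wllama_cache');
const entries = await cache.keys();
for (const req of entries) {
  console.log(req.url); // a typo'd URL showing up here confirms the bug
}
```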

Something interesting occurred while upgrading to version 1.8.0. Previously, it had been throwing an "Out of Memory" error, but that issue has now been resolved. However, a new problem has...

The "next generation of node package manager" ==> https://jsr.io/

In your readme you mention:

> Maybe doing a full RAG-in-browser example using tinyllama?

I've been looking into a way to allow users to 'chat with their documents'. A popular...
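For what it's worth, here is a minimal sketch of the retrieval step, assuming wllama exposes `createEmbedding()` and `createCompletion()` (the helper names and option fields below are illustrative assumptions, and the loaded model must support embeddings):

```typescript
import { Wllama } from '@wllama/wllama';

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

async function answerWithContext(
  wllama: Wllama,
  docs: string[],
  question: string,
): Promise<string> {
  // Embed every document chunk and the question.
  const docEmbeddings: number[][] = [];
  for (const doc of docs) {
    docEmbeddings.push(await wllama.createEmbedding(doc));
  }
  const qEmbedding = await wllama.createEmbedding(question);

  // Pick the chunk most similar to the question as context.
  let best = 0;
  for (let i = 1; i < docs.length; i++) {
    if (cosineSimilarity(docEmbeddings[i], qEmbedding) >
        cosineSimilarity(docEmbeddings[best], qEmbedding)) {
      best = i;
    }
  }

  // Prepend the retrieved chunk to the prompt and generate.
  const prompt = `Context:\n${docs[best]}\n\nQuestion: ${question}\nAnswer:`;
  return wllama.createCompletion(prompt, { nPredict: 128 });
}
```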

~~Data is currently passed as `Uint8Array`. We can do better by using Streams: https://developer.mozilla.org/en-US/docs/Web/API/Streams_API/Using_readable_streams~~ We are now using `Blob`, which already provides a `ReadableStream`.
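For context, this is why `Blob` makes the streaming approach straightforward: every `Blob` exposes `.stream()`, a standard `ReadableStream`, so data can be consumed chunk by chunk without materializing one large `Uint8Array` (the function name below is illustrative):

```typescript
// Read a Blob incrementally via its built-in ReadableStream.
async function consumeInChunks(blob: Blob): Promise<void> {
  const reader = blob.stream().getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    // value is a Uint8Array chunk; hand it off incrementally here.
    console.log(`got chunk of ${value.byteLength} bytes`);
  }
}
```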

First, thanks for putting this project together! I modified `examples/basic/index.html` to use a more capable model: `https://huggingface.co/lmstudio-ai/gemma-2b-it-GGUF/resolve/main/gemma-2b-it-q4_k_m.gguf`, which is 1.5 GB. Using [LM Studio](https://lmstudio.ai) on my laptop (with GPU acceleration disabled),...

With the introduction of heapfs, we can now do more low-level things. The idea is to load a `File`/`Blob` directly into wllama's heap without creating any intermediate buffer. This will...
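A very rough sketch of the idea, not wllama's actual internals: stream a `Blob` chunk by chunk straight into wasm memory, assuming an Emscripten-style `Module` that exports `_malloc` and `HEAPU8`.

```typescript
// Copy a Blob into the wasm heap without an intermediate ArrayBuffer.
async function copyBlobIntoHeap(Module: any, blob: Blob): Promise<number> {
  const ptr: number = Module._malloc(blob.size); // destination in wasm heap
  let offset = 0;
  const reader = blob.stream().getReader();
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    Module.HEAPU8.set(value, ptr + offset); // write chunk directly, no staging buffer
    offset += value.byteLength;
  }
  return ptr; // caller is responsible for Module._free(ptr)
}
```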