Victor Nogueira
Something interesting happened while upgrading to version 1.8.0: the "Out of Memory" error it had previously been throwing is now resolved, but a new problem has...
### System Info

Transformers.js v3.7.6 running on Linux. The issue is related to the conversion script only, which runs on Python 3.12.

### Environment/Platform

- [ ] Website/web-app
- [...
When running `Meta-Llama-3.1-8B-Instruct-Q2_K-(shards).gguf` in the [demo](https://huggingface.co/spaces/ngxson/wllama), it fails with the following error. Model, for reference (loaded with a 4096-token context):
For the record, the v2.3.2 update (https://github.com/ngxson/wllama/pull/179) made some large models (2.3 GB+ [Gemma 3 4B, Qwen 3 4B, and Llama 3.1 Nemotron Nano 4B, all at Q4_K_S with 4096...
This makes it easier for users to diagnose and fix issues, as suggested in #99.
- Resolves #53