fakezeta
Llama2 and Mistral base models are quite poor at embedding compared to sentence-transformer models like BERT. Why not integrate [bert.cpp](https://github.com/skeskinen/bert.cpp) or [sentence-transformers](https://sbert.net/) for the `api/embeddings` endpoint so we can have...
When offloading to an iGPU (UHD 770) in a Docker container from https://github.com/mudler/LocalAI (b2128), llama.cpp crashes with the following error: `The number of work-items in each dimension of a work-group cannot exceed...
**Description** This PR implements support for the chat template embedded in models for the transformers backend, enabled with the option:

```
template:
  use_tokenizer_template: true
```

in the YAML file. This way the embedded chat template in the...
Added
- Phi-3 `trust_remote_code: true`
- Hermes 2 Pro Llama3
- Multilingual-E5-base embedding model with OpenVINO acceleration (CPU and XPU)
- all-MiniLM-L6-v2 with OpenVINO acceleration (CPU and XPU)

**[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**...
[BUG] Phi3 Medium int4 Runtime Error: probability tensor contains either `inf`, `nan` or element < 0
### 🐛 Describe the bug

Hi, running Phi3 Medium on LocalAI with the OpenVINO backend, I found that while the int8 quantization works correctly, the int4 quant gives the following...
Add LocalAI project using OpenVINO for LLM inference.