fakezeta

Results 6 issues of fakezeta

Llama2 and mistral base model are quite poor in embedding compared to sentence tranformer models like bert. Why not integrate [bert.cpp](https://github.com/skeskinen/bert.cpp) or [sentence-transformers](https://sbert.net/) for `api/embeddings` endpoint so we can have...

When offloading to iGPU UHD 770 in a docker from https://github.com/mudler/LocalAI (b2128) llama.cpp crashes with the following error: `The number of work-items in each dimension of a work-group cannot exceed...

bug-unconfirmed

**Description** This PR implements embedded template in models for transformers backend with option: ``` template: use_tokenizer_template: true ``` in the yaml file. This way the embedded chat template in the...

Added - Phi-3 `trust_remote_code: true` - Hermes 2 Pro Llama3 - Multilingual-E5-base embedding model with OpenVINO acceleration (CPU and XPU) - all-MiniLM-L6-v2 with OpenVINO acceleration (CPU and XPU) **[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**...

area/ai-model

### 🐛 Describe the bug Hi, Running Phi3 Medium on LocalAI with OpenVINO backend I found that while the int8 quantization is working correctly, the int4 quant gives the following...

bug

Add LocalAI project using OpenVINO for LLM inference.