fakezeta issues

Results: 6 issues of fakezeta

[enhancement] use bert.cpp for /api/embeddings

Llama2 and Mistral base models are quite poor at embedding compared to sentence-transformer models like BERT. Why not integrate [bert.cpp](https://github.com/skeskinen/bert.cpp) or [sentence-transformers](https://sbert.net/) for the `api/embeddings` endpoint so we can have...

SYCL backend error PI_ERROR_INVALID_WORK_GROUP_SIZE on iGPU UHD 770

When offloading to the iGPU UHD 770 inside a Docker container from https://github.com/mudler/LocalAI (b2128), llama.cpp crashes with the following error: `The number of work-items in each dimension of a work-group cannot exceed...

bug-unconfirmed

Transformer Backend: Implementing use_tokenizer_template and stop_prompts options

**Description** This PR implements the models' embedded template for the transformers backend, enabled with the option:

```
template:
  use_tokenizer_template: true
```

in the YAML file. This way the embedded chat template in the...

gallery: Added some OpenVINO models

Added
- Phi-3 `trust_remote_code: true`
- Hermes 2 Pro Llama3
- Multilingual-E5-base embedding model with OpenVINO acceleration (CPU and XPU)
- all-MiniLM-L6-v2 with OpenVINO acceleration (CPU and XPU)

**[Signed commits](../CONTRIBUTING.md#signing-off-on-commits-developer-certificate-of-origin)**...

area/ai-model

[BUG] Phi3 Medium int4 Runtime Error: probability tensor contains either `inf`, `nan` or element < 0

### 🐛 Describe the bug

Hi, running Phi3 Medium on LocalAI with the OpenVINO backend, I found that while the int8 quantization works correctly, the int4 quant gives the following...

bug

Add LocalAI

Add LocalAI project using OpenVINO for LLM inference.