[BUG]: Embedding ollama+llama2-chinese, Chinese docx failed

Open wilsonlv opened this issue 10 months ago • 2 comments

How are you running AnythingLLM?

Docker (local)

What happened?

[Two screenshots attached: 微信图片_20240401162520, 微信图片_20240401162525]

Are there known steps to reproduce?

No response

wilsonlv avatar Apr 01 '24 08:04 wilsonlv

This seems to indicate that the Ollama embedder returned a zero-length vector. Can you first confirm manually, outside AnythingLLM, that this embedder processes text chunks properly?
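One way to run that manual check is to call Ollama's embeddings endpoint directly and look at the length of the returned vector. The following is only a minimal sketch: it assumes Ollama is reachable at its default local address (from inside a Docker container the host name will likely differ), and the helper names `build_request`, `extract_embedding`, and `embed` are made up for illustration.

```python
import json
import urllib.request

# Assumption: default local Ollama address; adjust the host when calling from Docker.
OLLAMA_URL = "http://localhost:11434/api/embeddings"


def build_request(model: str, text: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embeddings endpoint."""
    payload = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_embedding(response_body: bytes) -> list:
    """Return the embedding vector from a response body, or [] if it is missing or empty."""
    data = json.loads(response_body.decode("utf-8"))
    return data.get("embedding") or []


def embed(model: str, text: str, timeout: float = 60.0) -> list:
    """Send one text chunk to Ollama and return its embedding vector."""
    with urllib.request.urlopen(build_request(model, text), timeout=timeout) as resp:
        return extract_embedding(resp.read())


if __name__ == "__main__":
    # Compare an English and a Chinese chunk; a length of 0 reproduces the symptom.
    for sample in ("hello world", "你好，世界"):
        vector = embed("llama2-chinese", sample)
        print(repr(sample), "->", len(vector), "dimensions")
```

If the Chinese sample comes back with zero dimensions while the English one does not, the problem lies in the embedder rather than in AnythingLLM's document pipeline.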

timothycarambat avatar Apr 01 '24 17:04 timothycarambat

[image] Yes, if I add an English docx, it succeeds.

wilsonlv avatar Apr 02 '24 00:04 wilsonlv

Closing as wontfix. I did not notice this earlier, but you have an LLM chat model, llama2-chinese, set as your embedding model. LLMs cannot embed text. Please use an embedding model such as nomic-embed-text or mxbai for Chinese-language embeddings.

timothycarambat avatar Apr 19 '24 22:04 timothycarambat