embedding generation failed. wsarecv: An existing connection was forcibly closed by the remote host.
viosay/conan-embedding-v1 has an embedding length of 1024 and your test text is 1905 bytes, so it's exceeding the window. The client should chunk the text to segments smaller than the...
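A minimal sketch of what client-side chunking could look like (the `max_chars` and `overlap` values are illustrative, and as the later comments note, the real limit is in tokens, not characters or bytes):

```python
def chunk_text(text: str, max_chars: int = 1000, overlap: int = 100) -> list[str]:
    """Split text into overlapping segments small enough for the model's window."""
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + max_chars, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        start = end - overlap  # overlap so context isn't cut at a hard boundary
    return chunks
```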
ollama moved to a more recent llama.cpp snapshot for the granite model support (https://github.com/ollama/ollama/commit/f2890a4494f9fb3722ee7a4c506252362d1eab65) and presumably that has introduced some problems with embedding calls. I don't see any recent issues...
Just to correct a mistake I made, viosay/conan-embedding-v1 has a limit of 512 tokens, and shaw/dmeta-embedding-zh a limit of 1024 tokens. The embedding length is the size of the generated...
Tokens are different from characters. A token is a sequence of characters, on average 2 or 3 characters long, so a limit of 512 tokens would handle roughly 1024-1536 characters,...
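To make the distinction concrete, a quick sketch with a Hugging Face tokenizer (`bert-base-uncased` is used purely as an illustration; the embedding models in this thread ship their own vocabularies):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("bert-base-uncased")  # illustrative tokenizer

text = "Ollama generates embeddings for retrieval-augmented generation."
ids = tok.encode(text)
print(len(text), "characters ->", len(ids), "tokens")
# English text averages roughly 2-3 characters per token,
# so character counts overestimate how much fits in the window.
```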
It does truncate; it's just that the runner throws a `GGML_ASSERT(i01 >= 0 && i01 < ne01) failed` assertion and crashes when the number of tokens is close to the maximum...
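A sketch of a repro along these lines (model name is from this thread; the exact token count that trips the assert will depend on the tokenization):

```python
import requests

# Illustrative repro: a prompt whose token count lands near the model's
# 512-token limit. Whether this crashes depends on the exact tokenization.
prompt = "word " * 600

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={"model": "viosay/conan-embedding-v1", "prompt": prompt},
)
print(resp.status_code, resp.text[:200])
```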
It works if I set `num_ctx` to 512. Perhaps the lightrag framework is adding extra tokens, or [there is an issue with `chunk_token_size`](https://github.com/HKUDS/LightRAG/issues/102).

```console
$ ollama show milkey/gte:large-zh-f16
  Model
    arch ...
```
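For reference, `num_ctx` can also be passed per-request through the `options` field of ollama's embeddings API, so no Modelfile change is needed (sketch; the model name is from the comment above):

```python
import requests

resp = requests.post(
    "http://localhost:11434/api/embeddings",
    json={
        "model": "milkey/gte:large-zh-f16",
        "prompt": "some long document text...",
        # Cap the context so ollama truncates to what the model supports.
        "options": {"num_ctx": 512},
    },
)
print(len(resp.json()["embedding"]))
```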
The tokenizer used by OpenAI is different from the tokenizer used by conan-embedding-v1. You can see from your screenshots that all three OpenAI models return a different token count for...
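One way to see the mismatch is to count the same text with both kinds of tokenizer (sketch: `tiktoken` covers the OpenAI side, and `bert-base-chinese` stands in here for conan-embedding-v1's actual vocabulary):

```python
import tiktoken
from transformers import AutoTokenizer

text = "同一段文本在不同分词器下的 token 数并不相同。"

openai_tok = tiktoken.get_encoding("cl100k_base")          # OpenAI-style encoding
hf_tok = AutoTokenizer.from_pretrained("bert-base-chinese")  # illustrative stand-in

print("OpenAI-style tokens:", len(openai_tok.encode(text)))
print("BERT-style tokens:  ", len(hf_tok.encode(text)))
# The two counts differ, so an OpenAI token count says little about
# how many tokens a Chinese embedding model will see.
```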
Just to summarize the content above: the problem is that the context length ollama is using is longer than the context length the embedding model supports. If...
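A sketch of how you might check for this mismatch, assuming the `model_info` keys returned by ollama's `/api/show` endpoint follow the `<architecture>.context_length` pattern (e.g. `bert.context_length` for BERT-family embedding models):

```python
import requests

info = requests.post(
    "http://localhost:11434/api/show",
    json={"model": "shaw/dmeta-embedding-zh"},
).json()

model_info = info.get("model_info", {})
# Look for the architecture-specific context length key.
ctx = next(v for k, v in model_info.items() if k.endswith(".context_length"))
print("model supports", ctx, "tokens; keep num_ctx at or below this")
```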
To dig a bit deeper: the root cause is a miscalculation in the truncation logic. The prompt is truncated to `num_ctx` at the entry point of the API, but further...
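As an illustration of that failure mode (all names here are hypothetical, not ollama's actual source): the entry point truncates to the limit, but tokens appended later are not counted against it, so the runner can still receive more than `num_ctx` tokens and trip the assert.

```python
# Hypothetical sketch of the mis-count, not ollama's actual code.
NUM_CTX = 512

def truncate_prompt(tokens: list[int]) -> list[int]:
    # Entry-point truncation: cuts the prompt to num_ctx tokens...
    return tokens[:NUM_CTX]

def prepare_for_runner(tokens: list[int]) -> list[int]:
    # ...but special tokens are appended afterwards, so the runner
    # can see num_ctx + 2 tokens and fail the bounds assertion.
    BOS, EOS = 101, 102
    return [BOS] + tokens + [EOS]

batch = prepare_for_runner(truncate_prompt(list(range(600))))
print(len(batch))  # 514 > 512
```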
The data in the log are characters, not tokens. The padding is a function of the tokenizer table in shaw/dmeta-embedding-zh. The tokenizer uses [sentencepiece](https://huggingface.co/docs/transformers/en/tokenizer_summary#sentencepiece) and the spaces are represented internally...
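You can see that space handling directly with any sentencepiece-based tokenizer (`xlm-roberta-base` is used here purely as an example):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("xlm-roberta-base")  # sentencepiece-based
print(tok.tokenize("Hello world"))
# ['▁Hello', '▁world'] -- spaces become the ▁ (U+2581) metasymbol,
# so character counts in a log do not map 1:1 onto token counts.
```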