anything-llm
Failed to upload files in workspace with "failed to embed" error
LLM: ollama:latest
Embedder: AnythingLLM Embedder

Error log:
[TELEMETRY SENT] { event: 'documents_embedded_in_workspace', distinctId: '4aea0721-0b3f-4014-81dd-d78c248f6b7d', properties: { LLMSelection: 'ollama', Embedder: 'native', VectorDbSelection: 'lancedb' } }
-- Working testing.pdf --
-- Parsing content from pg 1 --
-- Parsing content from pg 2 --
-- Parsing content from pg 3 --
-- Parsing content from pg 4 --
[SUCCESS]: testing.pdf converted & ready for embedding.
Document testing.pdf uploaded processed and successfully. It is now available in documents.
[TELEMETRY SENT] { event: 'document_uploaded', distinctId: '4aea0721-0b3f-4014-81dd-d78c248f6b7d', properties: {} }
Adding new vectorized document into namespace test
Chunks created from document: 11
[INFO] The native embedding model has never been run and will be downloaded right now. Subsequent runs will be faster. (~23MB)
Failed to load the native embedding model: TypeError: fetch failed
    at Object.fetch (node:internal/deps/undici/undici:11730:11)
    at processTicksAndRejections (node:internal/process/task_queues:95:5)
    at runNextTicks (node:internal/process/task_queues:64:3)
    at process.processImmediate (node:internal/timers:447:9)
    at async getModelFile (file:///app/server/node_modules/@xenova/transformers/src/utils/hub.js:470:24)
    at async getModelJSON (file:///app/server/node_modules/@xenova/transformers/src/utils/hub.js:574:18)
    at async Promise.all (index 0)
    at async loadTokenizer (file:///app/server/node_modules/@xenova/transformers/src/tokenizers.js:52:16)
    at async AutoTokenizer.from_pretrained (file:///app/server/node_modules/@xenova/transformers/src/tokenizers.js:3920:48)
    at async Promise.all (index 0) {
  cause: ConnectTimeoutError: Connect Timeout Error
      at onConnectTimeout (node:internal/deps/undici/undici:6869:28)
      at node:internal/deps/undici/undici:6825:50
      at Immediate._onImmediate (node:internal/deps/undici/undici:6857:13)
      at process.processImmediate (node:internal/timers:476:21) {
    code: 'UND_ERR_CONNECT_TIMEOUT'
  }
}
addDocumentToNamespace fetch failed
Failed to vectorize custom-documents/testing.pdf-36edd41c-d784-454f-a885-a4b73a558ce7.json
[TELEMETRY SENT] { event: 'documents_embedded_in_workspace', distinctId: '4aea0721-0b3f-4014-81dd-d78c248f6b7d', properties: { LLMSelection: 'ollama', Embedder: 'native', VectorDbSelection: 'lancedb' } }
@stanltam Are you running AnythingLLM in a docker container on a MacBook with an M-series chip?
I'm getting the same error. I run AnythingLLM in a Docker container on an Ubuntu system.
@bioone This error is not from AnythingLLM itself. It's a connection timeout while @xenova/transformers downloads the model from Hugging Face. If you are having trouble embedding with the native embedder, you can swap to any other embedding provider and re-embed your documents.
Will look into whether there is a way we can either pre-bake the native embedder into the image or increase the timeout.
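For anyone curious what "pre-bake" means in practice: the model files would already be on disk inside the image, so the server never has to reach Hugging Face at runtime and the connect timeout can never trigger. A minimal sketch of that local-cache check, in Python for illustration only (AnythingLLM's actual logic lives in its Node server, and the exact file names are assumptions based on what @xenova/transformers typically requests for an ONNX model):

```python
from pathlib import Path

# Files the tokenizer/model loader typically fetches for an ONNX model.
# (Assumed names -- verify against the requests shown in your server log.)
REQUIRED_FILES = [
    "config.json",
    "tokenizer.json",
    "tokenizer_config.json",
    "onnx/model_quantized.onnx",
]

def model_is_prebaked(model_dir: str) -> bool:
    """Return True if every required file already exists locally,
    meaning no network fetch (and no timeout) is needed at runtime."""
    root = Path(model_dir)
    return all((root / name).is_file() for name in REQUIRED_FILES)
```

If the check passes, the loader can be pointed at the local directory instead of the Hub; if it fails, you are back to the runtime download and its timeout.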
@timothycarambat OK, I hope you can pre-bake the native embedder, it's very useful!!!
Hi there! Any updates on the issue? I still can't embed PDF files in a workspace (using LanceDB). I'm using LM Studio with AnythingLLM...
I deployed using Docker, but encountered an error when importing files, indicating a timeout.
[INFO] The native embedding model has never been run and will be downloaded right now. Subsequent runs will be faster. (~23MB)
Failed to load the native embedding model: TypeError: fetch failed
at Object.fetch (node:internal/deps/undici/undici:11731:11)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
at async getModelFile (file:///app/server/node_modules/@xenova/transformers/src/utils/hub.js:471:24)
at async getModelJSON (file:///app/server/node_modules/@xenova/transformers/src/utils/hub.js:575:18)
at async Promise.all (index 0)
at async loadTokenizer (file:///app/server/node_modules/@xenova/transformers/src/tokenizers.js:61:18)
at async AutoTokenizer.from_pretrained (file:///app/server/node_modules/@xenova/transformers/src/tokenizers.js:4296:50)
at async Promise.all (index 0)
at async loadItems (file:///app/server/node_modules/@xenova/transformers/src/pipelines.js:3115:5)
at async pipeline (file:///app/server/node_modules/@xenova/transformers/src/pipelines.js:3055:21) {
cause: ConnectTimeoutError: Connect Timeout Error
at onConnectTimeout (node:internal/deps/undici/undici:6869:28)
at node:internal/deps/undici/undici:6825:50
at Immediate._onImmediate (node:internal/deps/undici/undici:6857:13)
at process.processImmediate (node:internal/timers:476:21) {
code: 'UND_ERR_CONNECT_TIMEOUT'
}
}
addDocumentToNamespace fetch failed
@warrior-dl see pinned issue. HF is blocking your IP https://github.com/Mintplex-Labs/anything-llm/issues/821
Thank you very much, I successfully imported the file after manually downloading the model.
@warrior-dl Hi, could you please tell me how to manually download the embedding model? Where should the model be downloaded?
https://github.com/Mintplex-Labs/anything-llm/issues/821#issuecomment-1968382359
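The gist of the linked comment: fetch the model files on a machine that can reach Hugging Face, then place them under the server's model storage directory so the loader finds them locally. A hedged sketch of building that download plan (the repo id `Xenova/all-MiniLM-L6-v2`, the storage root, and the file list below are assumptions; check the linked comment for the exact layout your version expects):

```python
from pathlib import PurePosixPath

HF_BASE = "https://huggingface.co"

def download_plan(repo_id: str, storage_root: str, files: list[str]) -> list[tuple[str, str]]:
    """Map each required model file to a (source URL, local destination) pair.
    Fetch each URL with any HTTP client on an unblocked machine, then copy
    the file to its destination inside the container or volume."""
    plan = []
    for name in files:
        url = f"{HF_BASE}/{repo_id}/resolve/main/{name}"
        dest = str(PurePosixPath(storage_root) / repo_id / name)
        plan.append((url, dest))
    return plan
```

For example, `download_plan("Xenova/all-MiniLM-L6-v2", "/app/server/storage/models", ["config.json"])` would map `config.json` to `/app/server/storage/models/Xenova/all-MiniLM-L6-v2/config.json`. The key point is that the on-disk path must mirror the Hub repo layout so the loader resolves files without a network call.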