Faiss - Error: ID undefined not found.

Open noblerboy2004 opened this issue 5 months ago • 14 comments

Describe the bug

I'm using Faiss with LocalAI embeddings served by the 🦙 llama.cpp Python API. With an older version of Flowise, Faiss similarity search worked well. However, after upgrading to the latest version yesterday, I get Error: ID undefined not found. when using Faiss with LocalAI embeddings. The llama.cpp Python API server is unchanged.

When I set top K to 1, similarity search works.

To Reproduce

  1. Use Faiss with LocalAI embeddings
  2. Use the llama.cpp Python API for embedding
  3. Upsert a document with any kind of splitter
  4. Set top K to 2

Expected behavior

Similarity search should work when top K is greater than 1. Instead, it fails with Error: ID undefined not found.

Screenshots

No response

Flow

No response

Use Method

Docker

Flowise Version

3.0.4

Operating System

Linux

Browser

Chrome

Additional context

No response

noblerboy2004 avatar Aug 01 '25 08:08 noblerboy2004

The Flowise log for this bug:

2025-08-01 17:02:17.153 | 2025-08-01 10:02:17 [INFO]: ⬆️ POST /api/v1/internal-prediction/8537cf77-f5d0-40d3-875d-c45856a230c3
2025-08-01 17:02:17.424 | 2025-08-01 10:02:17 [ERROR]: [server]: Error: ID undefined not found.
2025-08-01 17:02:17.424 | Error: ID undefined not found.
2025-08-01 17:02:17.424 |     at SynchronousInMemoryDocstore.search (/usr/local/lib/node_modules/flowise/node_modules/langchain/dist/stores/doc/in_memory.cjs:85:19)
2025-08-01 17:02:17.424 |     at /usr/local/lib/node_modules/flowise/node_modules/flowise-components/dist/nodes/vectorstores/Faiss/Faiss.js:120:38
2025-08-01 17:02:17.424 |     at Array.map (<anonymous>)
2025-08-01 17:02:17.424 |     at similaritySearchVectorWithScore (/usr/local/lib/node_modules/flowise/node_modules/flowise-components/dist/nodes/vectorstores/Faiss/Faiss.js:118:26)
2025-08-01 17:02:17.424 |     at vectorStore.similaritySearchVectorWithScore (/usr/local/lib/node_modules/flowise/node_modules/flowise-components/dist/nodes/vectorstores/Faiss/Faiss.js:96:26)
2025-08-01 17:02:17.424 |     at FaissStore.similaritySearch (/usr/local/lib/node_modules/flowise/node_modules/@langchain/core/dist/vectorstores.cjs:264:36)
2025-08-01 17:02:17.424 |     at process.processTicksAndRejections (node:internal/process/task_queues:95:5)
2025-08-01 17:02:17.424 |     at async VectorStoreRetriever.getRelevantDocuments (/usr/local/lib/node_modules/flowise/node_modules/@langchain/core/dist/retrievers/index.cjs:121:29)
2025-08-01 17:02:17.424 |     at async VectorStoreRetriever._streamIterator (/usr/local/lib/node_modules/flowise/node_modules/@langchain/core/dist/runnables/base.cjs:173:9)
2025-08-01 17:02:17.424 |     at async VectorStoreRetriever.transform (/usr/local/lib/node_modules/flowise/node_modules/@langchain/core/dist/runnables/base.cjs:410:9)
2025-08-01 17:02:17.424 |     at async RunnableSequence._streamIterator (/usr/local/lib/node_modules/flowise/node_modules/@langchain/core/dist/runnables/base.cjs:1369:30)
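For readers following the trace: the error is thrown by the docstore lookup inside an Array.map over search results, which suggests that at least one returned row had no document ID mapped to it. The sketch below is a self-contained simulation of that pattern, not the Flowise or LangChain source. TinyDocstore, mapping, lookupStrict and lookupSafe are hypothetical names, and the -1 label is an assumed "no result" marker. Whether this is the actual cause in this issue is not confirmed.

```typescript
// Hypothetical stand-ins for the docstore and the row-to-ID mapping; not Flowise code.
type Doc = { pageContent: string };

class TinyDocstore {
  private docs = new Map<string, Doc>();

  add(id: string, doc: Doc): void {
    this.docs.set(id, doc);
  }

  // Mirrors the behaviour seen in the trace: an unknown ID throws.
  search(id: string | undefined): Doc {
    const doc = id === undefined ? undefined : this.docs.get(id);
    if (!doc) throw new Error(`ID ${id} not found.`);
    return doc;
  }
}

const docstore = new TinyDocstore();
docstore.add("doc-a", { pageContent: "first chunk" });

// Row index in the vector index -> document ID. Only row 0 is mapped.
const mapping: Record<number, string> = { 0: "doc-a" };

// Labels as a vector index might return them for k = 2 when only one row resolves.
// The -1 value is an assumption used for illustration.
const labels = [0, -1];

// Looks up every row without checking the mapping first.
function lookupStrict(rows: number[]): Doc[] {
  return rows.map((row) => docstore.search(mapping[row]));
}

// Skips rows that have no mapped document ID.
function lookupSafe(rows: number[]): Doc[] {
  return rows
    .filter((row) => mapping[row] !== undefined)
    .map((row) => docstore.search(mapping[row]));
}

let strictError = "";
try {
  lookupStrict(labels);
} catch (err) {
  strictError = (err as Error).message;
}

const singleResult = lookupStrict([0]); // the k = 1 case resolves
const safeResult = lookupSafe(labels);  // the unmapped row is dropped

console.log(strictError, singleResult.length, safeResult.length);
```

In this simulation the k = 1 lookup succeeds while the k = 2 lookup throws, which matches the symptom reported above. Dropping unmapped rows only hides the symptom; it does not explain why a row is unmapped.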

noblerboy2004 avatar Aug 01 '25 10:08 noblerboy2004

I found that Flowise version 2.2.6 works well with all models. Even the search quality is better than in the latest version. In the latest version, the search output is the same for every query. Terrible.

noblerboy2004 avatar Aug 05 '25 06:08 noblerboy2004

yeah... this one definitely falls under our diagnostic list ~ Problem No.13: post‑update retrieval collapse (FAISS layer).

it usually shows up when FAISS vectorstore updates silently break internal ID mappings after model upgrade or config shift. from your logs, looks like the top‑k>1 search is triggering a misaligned reference to non-existent vector IDs. fallback to k=1 working fine just confirms it's not ingestion, but ID indexing mismatch.

we've seen this pattern across several pipelines recently. our tooling has a fix (MIT licensed, also got a star from the tesseract.js creator), let me know if you want the patch or the full module. happy to share.

onestardao avatar Aug 07 '25 03:08 onestardao

Hi onestardao,

Thank you for your response. Please share the tool to fix this problem. Either the patch or the full module is fine. I'm using Docker.

Thank you.

noblerboy2004 avatar Aug 07 '25 16:08 noblerboy2004

yep this one maps directly to Problem No.13: Multi-Agent Chaos, as defined in the WFGY diagnostic map.

the faiss module update broke internal mappings, likely due to an ID misalignment between the store and runtime. the ID undefined error isn't the root cause, it's a downstream symptom of a corrupted reference loop across agents (common in multi-module setups like flowise). we've debugged this in multiple RAG systems, and the core issue is always the same: missing boundary reset or agent sync on patch updates.

we’ve documented this exact failure here, along with tested fixes (both soft fallback and module-level patch): 🔧 https://github.com/onestardao/WFGY/blob/main/ProblemMap/multi-agent-chaos.md

happy to share a lightweight module if you want to patch this locally. just ping us if needed.

onestardao avatar Aug 08 '25 01:08 onestardao

Hi onestardao,

Please share it. I read the post you linked, but I could not work out how to fix this problem in Flowise.

Thank you.

noblerboy2004 avatar Aug 09 '25 16:08 noblerboy2004

Is this with just LocalAI embedding? Have you tried other embedding models?

HenryHengZJ avatar Aug 09 '25 16:08 HenryHengZJ

Hi HenryHengZJ,

I have only tried LocalAI embedding, with three models: multilingual-e5-large-instruct.gguf, all-MiniLM-L6-v2.Q8_0.gguf, and bge-m3-Q8_0.gguf. Both multilingual-e5-large-instruct.gguf and bge-m3-Q8_0.gguf fail with Error: ID undefined not found. all-MiniLM-L6-v2.Q8_0.gguf runs, but the Retrieval Playground (Test your vector store retrieval settings) always returns the same result for any query, which means Faiss is not working correctly.
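One way to narrow down the "same result for any query" symptom is to check whether the embedding server returns distinct vectors for distinct inputs. The sketch below assumes the vectors have already been fetched from the embedding endpoint and only compares them. The function names, the 0.999 threshold and the sample vectors are all made up for illustration, and this has not been verified against the setup in this issue.

```typescript
// Cosine similarity between two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / Math.sqrt(normA * normB);
}

// A vector is unusable if it is empty, all zeros, or contains NaN/Infinity.
function isUsable(v: number[]): boolean {
  return v.length > 0 && v.every((x) => Number.isFinite(x)) && v.some((x) => x !== 0);
}

// True when any vector is unusable, or when every pair is near-identical.
// The 0.999 threshold is an arbitrary illustrative value.
function embeddingsLookDegenerate(vectors: number[][], threshold = 0.999): boolean {
  if (vectors.length < 2) throw new Error("Need at least two vectors to compare");
  if (!vectors.every(isUsable)) return true;
  for (let i = 0; i < vectors.length; i++) {
    for (let j = i + 1; j < vectors.length; j++) {
      if (cosineSimilarity(vectors[i], vectors[j]) < threshold) return false;
    }
  }
  return true;
}

// Made-up vectors standing in for embeddings of three unrelated queries.
const healthy = [[1, 0, 0], [0, 1, 0], [0.6, 0.8, 0]];
const collapsed = [[0.5, 0.5, 0.5], [0.5, 0.5, 0.5], [0.5, 0.5, 0.5]];

const healthyFlag = embeddingsLookDegenerate(healthy);
const collapsedFlag = embeddingsLookDegenerate(collapsed);
console.log({ healthyFlag, collapsedFlag });
```

If embeddings of unrelated queries come back flagged as degenerate, the problem would be upstream of Faiss. If they look distinct, the vector store layer remains the more likely suspect.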

Thank you.

noblerboy2004 avatar Aug 09 '25 16:08 noblerboy2004

If you just need the Flowise-specific steps to fix it, here’s the quickest path:

  1. Stop the Flowise server.
  2. In your Flowise data folder, delete the FAISS index directory for that flow (and any mapping files).
  3. Restart Flowise.
  4. Re-upload your documents, but stick to one embedding model (no mixing dims/models).
  5. Test in the Retrieval Playground with 2-3 very different queries to confirm unique results.

This removes the corrupted {doc_id ↔ faiss_row_id} mapping and forces a clean rebuild.
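Before deleting anything, it may be worth checking whether the saved mapping really is inconsistent. The sketch below assumes the store was saved as a docstore.json holding a two-element array: a list of [id, document] pairs followed by a row-to-ID mapping object. That layout is an assumption here and should be confirmed against the actual file. findOrphanRows and the sample data are hypothetical.

```typescript
// Assumed shape of a parsed docstore.json: [[id, document] pairs, row -> id mapping].
type SavedDocstore = [Array<[string, unknown]>, Record<string, string>];

// Returns the row keys whose mapped ID has no matching document entry.
function findOrphanRows(saved: SavedDocstore): string[] {
  const [entries, mapping] = saved;
  const knownIds = new Set(entries.map(([id]) => id));
  return Object.keys(mapping).filter((row) => !knownIds.has(mapping[row]));
}

// Made-up example: row "1" points at an ID that is not in the docstore.
const sample: SavedDocstore = [
  [
    ["doc-a", { pageContent: "first chunk" }],
    ["doc-b", { pageContent: "second chunk" }],
  ],
  { "0": "doc-a", "1": "doc-missing", "2": "doc-b" },
];

const orphans = findOrphanRows(sample);
console.log(orphans);
```

An empty result would suggest that the saved mapping is internally consistent, in which case deleting and rebuilding the index is unlikely to change anything.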

onestardao avatar Aug 10 '25 03:08 onestardao

Hi onestardao,

I built the Docker image from the Dockerfile and tried the steps above, but the result is still the same.

Thank you.

noblerboy2004 avatar Aug 11 '25 02:08 noblerboy2004

This is Problem No.7 (FAISS index mapping collapse and rebuild), one of the most common vector store failure types in RAG setups. The full fix and prevention steps are documented here (MIT-licensed, with real implementation examples): WFGY ProblemMap →

Following that guide will prevent recurring issues like ID undefined, mapping loss, and the need to rebuild the index after every restart.

onestardao avatar Aug 11 '25 03:08 onestardao

Hi onestardao,

I'm still confused about how to fix the problem. Could you please show me how to fix it in Flowise?

Thank you.

noblerboy2004 avatar Aug 12 '25 06:08 noblerboy2004

It’s ok if the doc feels like a lot — you don’t have to read everything in one go. Here’s the easiest way to fix this in Flowise without getting lost:

Download TXT OS (MIT-licensed, same engine from the Problem Map).

Open any AI chat you already use, upload TXT OS, and just ask it: “Use the WFGY module inside TXT OS to fix Problem No.7 (FAISS index mapping collapse) for my Flowise setup.”

The AI will walk you through the exact steps for your environment.

If anything’s unclear or still fails, post the AI’s response here — I can point out what to adjust.

You don’t need to memorise the whole doc — the AI can extract the relevant part for you, and I can help fine-tune from there.

Problem Map link for reference: https://github.com/onestardao/WFGY/blob/main/ProblemMap/README.md

onestardao avatar Aug 12 '25 09:08 onestardao

The problem still happens in version 3.0.7. Could anyone help?

noblerboy2004 avatar Oct 02 '25 07:10 noblerboy2004