DocsGPT
🐛 Bug Report: Debugging middleware caught exception in streamed response at a point where response headers were already sent
📜 Description
👟 Reproduction steps
./setup.sh
👍 Expected behavior
A successful streamed response to the query.
👎 Actual Behavior with Screenshots
Same error as described in the title.
💻 Operating system
MacOS
What browsers are you seeing the problem on?
Chrome
🤖 What development environment are you experiencing this bug on?
Docker
🔒 Did you set the correct environment variables in the right path? List the environment variable names (not values please!)
No response
📃 Provide any additional context for the Bug.
No response
📖 Relevant log output
No response
👀 Have you spent some time to check if this bug has been raised before?
- [X] I checked and didn't find similar issue
🔗 Are you willing to submit PR?
None
🧑‍⚖️ Code of Conduct
- [X] I agree to follow this project's Code of Conduct
I encountered the same issue. I created the venv and ran ./setup.sh inside it; every service runs successfully on Docker, but the error occurs when submitting a query. I'm using Python 3.10.
Did you choose option 1 or 2? Please note it might also take a while for a response.
I used option 1.
Any error trace in your console?
Here is the error trace:
llama_new_context_with_model: n_ctx = 2048
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 161.88 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 0 | ARM_FMA = 0 | F16C = 0 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
Debugging middleware caught exception in streamed response at a point where response headers were already sent.
Traceback (most recent call last):
File "/Users/user1/Documents/Github/DocsGPT/venv/lib/python3.11/site-packages/werkzeug/wsgi.py", line 256, in __next__
return self._next()
^^^^^^^^^^^^
File "/Users/user1/Documents/Github/DocsGPT/venv/lib/python3.11/site-packages/werkzeug/wrappers/response.py", line 32, in _iter_encoded
for item in iterable:
File "/Users/user1/Documents/Github/DocsGPT/application/api/answer/routes.py", line 120, in complete_stream
docs = docsearch.search(question, k=2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user1/Documents/Github/DocsGPT/application/vectorstore/faiss.py", line 20, in search
return self.docsearch.similarity_search(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user1/Documents/Github/DocsGPT/venv/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 334, in similarity_search
docs_and_scores = self.similarity_search_with_score(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user1/Documents/Github/DocsGPT/venv/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 276, in similarity_search_with_score
docs = self.similarity_search_with_score_by_vector(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user1/Documents/Github/DocsGPT/venv/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 219, in similarity_search_with_score_by_vector
scores, indices = self.index.search(vector, k if filter is None else fetch_k)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/user1/Documents/Github/DocsGPT/venv/lib/python3.11/site-packages/faiss/class_wrappers.py", line 329, in replacement_search
assert d == self.d
^^^^^^^^^^^
AssertionError
127.0.0.1 - - [11/Oct/2023 14:54:13] "POST /stream HTTP/1.1" 200 -
Encountered the same issue on macOS when running ./setup.sh (with Python in a venv) -> Option 1. Got this stack trace:
...................................................................................................
llama_new_context_with_model: n_ctx = 2048
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 161.88 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
Shape of query vector: (1, 768)
Query vector dimension: 768
Faiss index dimension: 1536
Debugging middleware caught exception in streamed response at a point where response headers were already sent.
Traceback (most recent call last):
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/werkzeug/wsgi.py", line 256, in __next__
return self._next()
^^^^^^^^^^^^
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/werkzeug/wrappers/response.py", line 32, in _iter_encoded
for item in iterable:
File "/Users/axel/repos/DocsGPT/application/api/answer/routes.py", line 120, in complete_stream
docs = docsearch.search(question, k=2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/axel/repos/DocsGPT/application/vectorstore/faiss.py", line 20, in search
return self.docsearch.similarity_search(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 334, in similarity_search
docs_and_scores = self.similarity_search_with_score(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 276, in similarity_search_with_score
docs = self.similarity_search_with_score_by_vector(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/langchain/vectorstores/faiss.py", line 219, in similarity_search_with_score_by_vector
scores, indices = self.index.search(vector, k if filter is None else fetch_k)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/faiss/class_wrappers.py", line 333, in replacement_search
assert d == self.d
^^^^^^^^^^^
AssertionError
127.0.0.1 - - [12/Oct/2023 12:54:06] "POST /stream HTTP/1.1" 200 -
When adding some print statements I got these values of d and self.d:
Shape of x: (1, 768)
d: 768
self.d: 1536
And when I just bypassed the assert d == self.d I got this downstream error:
llama_new_context_with_model: n_ctx = 2048
llama_new_context_with_model: freq_base = 10000.0
llama_new_context_with_model: freq_scale = 1
llama_new_context_with_model: kv self size = 1024.00 MB
llama_new_context_with_model: compute buffer total size = 161.88 MB
AVX = 0 | AVX2 = 0 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 0 | NEON = 1 | ARM_FMA = 1 | F16C = 0 | FP16_VA = 1 | WASM_SIMD = 0 | BLAS = 1 | SSE3 = 0 | SSSE3 = 0 | VSX = 0 |
docs:
[]
question:
Hello
Debugging middleware caught exception in streamed response at a point where response headers were already sent.
Traceback (most recent call last):
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/werkzeug/wsgi.py", line 256, in __next__
return self._next()
^^^^^^^^^^^^
File "/Users/axel/repos/DocsGPT/venv/lib/python3.11/site-packages/werkzeug/wrappers/response.py", line 32, in _iter_encoded
for item in iterable:
File "/Users/axel/repos/DocsGPT/application/api/answer/routes.py", line 126, in complete_stream
docs = [docs[0]]
^^^^^^^^
IndexError: list index out of range
Ahhh, please try ingesting your own documents and then asking questions about them. Basically, the pre-loaded index will not work; just upload any PDF or DOC file.
It worked when I uploaded a document with the assertion commented out.
However, to comment on the original bug: from what I have gathered so far (and please correct me if I'm wrong):
The embedding function of the docsgpt-7b-f16.gguf model outputs embedding vectors of length 768, which corresponds to d in the assert d == self.d. However, the FAISS vector store seems to be initialized with an index (corresponding to self) that assumes vector length 1536 for some reason. I haven't quite figured out where this initialization happens or what kind of index is passed, though.
More than willing to make a PR once we get to the bottom of this (assuming it doesn't get solved before then).
Update: I think the problem is that in this case we are loading the embedding function from this transformer: huggingface_sentence-transformers/all-mpnet-base-v2, which has an embedding length of 768, but the index loaded into the FAISS vector store comes from the file application/index.faiss, which expects an embedding length of 1536.
So basically here:
class FaissStore(BaseVectorStore):
    def __init__(self, path, embeddings_key, docs_init=None):
        super().__init__()
        self.path = path
        if docs_init:
            self.docsearch = FAISS.from_documents(
                docs_init, self._get_embeddings(settings.EMBEDDINGS_NAME, embeddings_key)
            )
        else:
            self.docsearch = FAISS.load_local(
                self.path, self._get_embeddings(settings.EMBEDDINGS_NAME, settings.EMBEDDINGS_KEY)
            )
When docs_init is not provided, we have the path application/, which leads us to application/index.faiss (d=1536), but EMBEDDINGS_NAME is huggingface_sentence-transformers/all-mpnet-base-v2, which gives us vectors of length 768; hence the mismatch.
I guess OpenAI's embeddings have length 1536, so maybe that's why it works when using the OpenAI API?
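Given that analysis, one possible fix is to fail fast at load time instead of surfacing the cryptic AssertionError mid-stream. A sketch of a hypothetical guard (assert_embedding_dim_matches is not part of DocsGPT; it assumes the LangChain-style embed_query interface used above):

```python
def assert_embedding_dim_matches(index_dim: int, embeddings) -> None:
    """Compare a loaded FAISS index's dimension with the embedding
    model's output length and raise a descriptive error on mismatch."""
    probe = embeddings.embed_query("dimension probe")  # one sample embedding
    if len(probe) != index_dim:
        raise ValueError(
            f"Embedding length {len(probe)} does not match FAISS index "
            f"dimension {index_dim}; re-ingest your documents with the "
            "current EMBEDDINGS_NAME or use the embeddings the index "
            "was built with."
        )
```

In FaissStore.__init__ this could run right after FAISS.load_local (e.g. passing self.docsearch.index.d), so a 768-vs-1536 mismatch is reported immediately instead of on the first query.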
Thank you for the PR!