Can't load a PDF document to try out RAG
Installed Transformer Lab today and wanted to try out RAG, so I installed the llamaindex_simple_document_search plugin.
But if I go to either Interact > Query Docs or Documents and try to upload a PDF file, I see a loading spinner and nothing happens.
Hi, we have fixed this one. The issue usually happens when someone uploads a document without first creating any folders. This should be solved in the next release.
If you'd like to try it before the official release, you can build it from a clone of the app and API. Please let me know if you still face issues after this fix.
Thanks, I updated my application today and now I'm able to upload a single document without creating a folder first.
But now when I use the Interact > Query Docs feature and type in my question for the document, hitting the "send" button doesn't seem to do anything.
Hey,
We made a few more changes to Transformer Lab. If you're able to, could you please reinstall the RAG plugin and try again?
RAG now only looks for documents within a sub-folder called rag. That is why, when you open the Query Docs (RAG) window within the Interact tab, you'll see the Documents preview on the side pointed at a folder called rag. It's likely that you uploaded the document outside of this folder, so the model found nothing to answer with and errored out. Could you let me know if you see any documents in the Query Docs tab? If not, maybe try uploading one there?
Uploading documents to a folder called rag within the Documents tab should also work.
We're in the process of making RAG more flexible so that in the future you can tag which folders to index, rather than everything getting indexed.
OK, thanks for the update. I updated Transformer Lab again, but still didn't see a folder called rag under my documents.
I created one myself and then uploaded my document there. Will users have to always create a directory before uploading their first document? There's nothing in the app currently to hint that that's the case.
Anyway, with my document there I can now ask a question and see some loading that seems to be happening - but then my selected model seems to crash and I see the following error:
Settings: {'_llm': None, '_embed_model': None, '_callback_manager': None, '_tokenizer': None, '_node_parser': SentenceSplitter(include_metadata=True, include_prev_next_rel=True, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7f2f388f18d0>, id_func=<function default_id_func at 0x7f3038a474c0>, chunk_size=512, chunk_overlap=50, separator=' ', paragraph_separator='\n\n\n', secondary_chunking_regex='[^,.;。?!]+[,.;。?!]?'), '_prompt_helper': PromptHelper(context_window=4096, num_output=256, chunk_overlap_ratio=0.1, chunk_size_limit=None, separator=' '), '_transformations': None}
Loaded 8 docs
Traceback (most recent call last):
File "/home/matt/.transformerlab/src/transformerlab/plugin_sdk/plugin_harness.py", line 39, in <module>
main.main()
File "/home/matt/.transformerlab/workspace/plugins/llamaindex_simple_document_search/main.py", line 126, in main
rag_response = query_engine.query(args.query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py", line 52, in query
query_result = self._query(str_or_query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 179, in _query
response = self._response_synthesizer.synthesize(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/base.py", line 241, in synthesize
response_str = self.get_response(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 43, in get_response
return super().get_response(
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 179, in get_response
response = self._give_response_single(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 241, in _give_response_single
program(
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 85, in __call__
answer = self._llm.predict(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/llm.py", line 605, in predict
chat_response = self.chat(messages)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai_like/base.py", line 117, in chat
return super().chat(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 173, in wrapped_llm_chat
f_return_val = f(_self, messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 374, in chat
return chat_fn(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 107, in wrapper
return retry(f)(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 336, in wrapped_f
return copy(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 475, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 376, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 398, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 478, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 470, in _chat
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_utils/_utils.py", line 279, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 863, in create
return self._post(
^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1283, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 960, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1005, in _request
return self._retry_request(
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1098, in _retry_request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1064, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'Expected model: . Your model: DeepSeek-R1-Distill-Qwen-1.5B', 'code': 40301}
Also it's not clear from the UI currently what document I'm using to query against. Do I have to open the new rag folder in the file listing if I want to use documents there?
@mcharters To answer your initial questions: you can now upload your first documents without needing to create any directories. To use a specific document for RAG and index it, install the RAG plugin, go to the RAG plugin section in the Interact tab, and upload any document there. It will automatically create a folder called rag and reindex as needed. From then on, any document you upload into that folder, whether from the Documents tab or from within RAG, will be indexed.
The error that you pasted shows that the model crashed before answering. Would it be possible for you to try a smaller model, in case this is a system constraint?
I will also do a fresh install on my Linux and macOS systems to try and replicate this.
OK, there were a couple of issues here - turns out I didn't have the most recent version of the app due to a Windows auto-update issue. I now have version 0.11 and tried RAG again with DeepSeek. I got an error message but couldn't actually see it because the error was blocked by a modal that said "no model running" or something similar.
I took your advice and tried a smaller model (TinyLlama) and here's the issue I'm getting now:
Settings: {'_llm': None, '_embed_model': None, '_callback_manager': None, '_tokenizer': None, '_node_parser': SentenceSplitter(include_metadata=True, include_prev_next_rel=True, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7f7bb7c8c590>, id_func=<function default_id_func at 0x7f7cb781b4c0>, chunk_size=512, chunk_overlap=50, separator=' ', paragraph_separator='\n\n\n', secondary_chunking_regex='[^,.;。?!]+[,.;。?!]?'), '_prompt_helper': PromptHelper(context_window=4096, num_output=256, chunk_overlap_ratio=0.1, chunk_size_limit=None, separator=' '), '_transformations': None}
Loaded 8 docs
Traceback (most recent call last):
File "/home/matt/.transformerlab/src/transformerlab/plugin_sdk/plugin_harness.py", line 39, in <module>
main.main()
File "/home/matt/.transformerlab/workspace/plugins/llamaindex_simple_document_search/main.py", line 126, in main
rag_response = query_engine.query(args.query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py", line 52, in query
query_result = self._query(str_or_query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 179, in _query
response = self._response_synthesizer.synthesize(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/base.py", line 241, in synthesize
response_str = self.get_response(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 43, in get_response
return super().get_response(
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 179, in get_response
response = self._give_response_single(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 241, in _give_response_single
program(
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 85, in __call__
answer = self._llm.predict(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/llm.py", line 605, in predict
chat_response = self.chat(messages)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai_like/base.py", line 117, in chat
return super().chat(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 173, in wrapped_llm_chat
f_return_val = f(_self, messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 374, in chat
return chat_fn(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 107, in wrapper
return retry(f)(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 336, in wrapped_f
return copy(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 475, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 376, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 398, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 478, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 470, in _chat
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_utils/_utils.py", line 279, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/resources/chat/completions.py", line 863, in create
return self._post(
^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1283, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 960, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1064, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': "This model's maximum context length is 2048 tokens. However, you requested 2602 tokens (2090 in the messages, 512 in the completion). Please reduce the length of the messages or completion.", 'code': 40303}
My query wasn't too many tokens, but I guess my query plus the RAG data was? But there's no way as a user to see or do anything about that.
Hmmm... looks like we need something somewhere to catch this and return it as an error so the app can display it to the user. Alternatively, the app should probably know when it's going to blow through the context size and stop beforehand.
But yeah, you generally need a model with a larger context window to use RAG. The newer models all have bigger context sizes, and most have small versions (Llama 3 and Qwen 2.5 are both good).
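For reference, the arithmetic behind that 400 error can be checked before the request is ever sent. Here is a minimal sketch of such a pre-flight guard in Python; the function name is made up and the numbers simply mirror the traceback above, so treat this as an illustration rather than the actual plugin code:

def fits_in_context(prompt_tokens: int, completion_tokens: int, context_window: int) -> bool:
    """Return True if the prompt plus the reserved completion fits in the model's context."""
    return prompt_tokens + completion_tokens <= context_window

# Numbers straight from the error: 2090 prompt + 512 completion = 2602 > 2048.
print(fits_in_context(2090, 512, 2048))  # False -> the server rejects with a 400
print(fits_in_context(2090, 512, 4096))  # True  -> a 4k-context model would accept it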
OK, using unsloth/Llama-3.2-1B-Instruct I was able to get one response using RAG, but most of the time the model still crashes.
One time it crashed, I was able to see the error, and it was the same as the DeepSeek error above but naming the Llama model instead.
I've also seen the model crash, then seemingly come back online, and then I get a response.
I'm running on a pretty puny 4 GB GPU on my laptop so maybe RAG is just too much to ask? :) Regular chat seems to work OK.
Hey @mcharters, just writing back to you since we added some new things. Transformer Lab now uses markitdown as the document processing engine. You can upload a lot more file types and also process more complex PDFs, as we first convert everything to Markdown and then provide that to RAG. I would love to hear about your experience if you managed to get a model running with your GPU later on.
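In case it helps anyone following along, here is a minimal sketch of that conversion step using the markitdown package; the file name is a placeholder and this illustrates the approach rather than the exact code Transformer Lab ships:

from markitdown import MarkItDown

md = MarkItDown()
result = md.convert("report.pdf")    # PDF (or docx, pptx, ...) in
print(result.text_content[:500])     # Markdown text out, ready for chunking and indexing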
OK I updated Transformer Lab and tried a query with an existing doc in my library. I got an error:
Settings: {'_llm': None, '_embed_model': None, '_callback_manager': None, '_tokenizer': None, '_node_parser': SentenceSplitter(include_metadata=True, include_prev_next_rel=True, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7f01707a7210>, id_func=<function default_id_func at 0x7f0171d83a60>, chunk_size=256, chunk_overlap=50, separator=' ', paragraph_separator='\n\n\n', secondary_chunking_regex='[^,.;。?!]+[,.;。?!]?'), '_prompt_helper': PromptHelper(context_window=2048, num_output=256, chunk_overlap_ratio=0.1, chunk_size_limit=None, separator=' '), '_transformations': None}
Loaded 8 docs
Traceback (most recent call last):
File "/home/matt/.transformerlab/src/transformerlab/plugin_sdk/plugin_harness.py", line 39, in <module>
main.main()
File "/home/matt/.transformerlab/workspace/plugins/llamaindex_simple_document_search/main.py", line 148, in main
rag_response = query_engine.query(args.query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py", line 52, in query
query_result = self._query(str_or_query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 178, in _query
nodes = self.retrieve(query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 133, in retrieve
nodes = self._retriever.retrieve(query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/base/base_retriever.py", line 245, in retrieve
nodes = self._retrieve(query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 103, in _retrieve
return self._get_nodes_with_embeddings(query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/indices/vector_store/retrievers/retriever.py", line 180, in _get_nodes_with_embeddings
query_result = self._vector_store.query(query, **self._kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/vector_stores/simple.py", line 376, in query
top_similarities, top_ids = get_top_k_embeddings(
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/indices/query/embedding_utils.py", line 30, in get_top_k_embeddings
similarity = similarity_fn(query_embedding_np, emb) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/base/embeddings/base.py", line 62, in similarity
product = np.dot(embedding1, embedding2)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
ValueError: shapes (768,) and (384,) not aligned: 768 (dim 0) != 384 (dim 0)
I tried deleting the file and adding a different one, which created a new .tlab_markitdown file in my Documents/rag folder.
I tried another query and got a similar error. Still using unsloth/Llama-3.2-1B-Instruct.
This seems quite weird. I think the embeddings from your older embedding model and this newer one somehow got mixed up. Would it be possible for you to delete the "persist" directory in your rag folder and re-index everything once?
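For context, that ValueError is exactly what you get when a query embedding from one model is compared against vectors persisted by another. A minimal illustration (the dimensions come from the traceback; the zero vectors are placeholders):

import numpy as np

stored = np.zeros(768)  # vector persisted by the old 768-dim embedding model
query = np.zeros(384)   # vector from the current 384-dim embedding model

try:
    np.dot(stored, query)  # first step of the cosine-similarity computation
except ValueError as e:
    print(e)  # shapes (768,) and (384,) not aligned: 768 (dim 0) != 384 (dim 0)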
Hmm, I deleted my rag directory, started a model and went to the rag query, added my document back and then got this:
Settings: {'_llm': None, '_embed_model': None, '_callback_manager': None, '_tokenizer': None, '_node_parser': SentenceSplitter(include_metadata=True, include_prev_next_rel=True, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7fa7ae9242d0>, id_func=<function default_id_func at 0x7fa7aef03a60>, chunk_size=256, chunk_overlap=50, separator=' ', paragraph_separator='\n\n\n', secondary_chunking_regex='[^,.;。?!]+[,.;。?!]?'), '_prompt_helper': PromptHelper(context_window=2048, num_output=256, chunk_overlap_ratio=0.1, chunk_size_limit=None, separator=' '), '_transformations': None}
Loaded 15 docs
Traceback (most recent call last):
File "/home/matt/.transformerlab/src/transformerlab/plugin_sdk/plugin_harness.py", line 39, in <module>
main.main()
File "/home/matt/.transformerlab/workspace/plugins/llamaindex_simple_document_search/main.py", line 134, in main
storage_context = StorageContext.from_defaults(persist_dir=persistency_dir)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/storage/storage_context.py", line 111, in from_defaults
docstore = docstore or SimpleDocumentStore.from_persist_dir(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/storage/docstore/simple_docstore.py", line 57, in from_persist_dir
return cls.from_persist_path(persist_path, namespace=namespace, fs=fs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/storage/docstore/simple_docstore.py", line 74, in from_persist_path
simple_kvstore = SimpleKVStore.from_persist_path(persist_path, fs=fs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/storage/kvstore/simple_kvstore.py", line 97, in from_persist_path
with fs.open(persist_path, "rb") as f:
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/fsspec/spec.py", line 1303, in open
f = self._open(
^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/fsspec/implementations/local.py", line 195, in _open
return LocalFileOpener(path, mode, fs=self, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/fsspec/implementations/local.py", line 359, in __init__
self._open()
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/fsspec/implementations/local.py", line 364, in _open
self.f = open(self.path, mode=self.mode)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
FileNotFoundError: [Errno 2] No such file or directory: '/home/matt/.transformerlab/workspace/experiments/alpha/documents/rag/persist/docstore.json'
So yeah, it does look like something's still looking for the old "persist" folder.
Could you try clicking the "Reindex" button once? I think it didn't reindex automatically for some reason, which I will check out.
Edit: I checked the auto-indexing on RAG and it is currently broken: uploading a document does not automatically re-index documents on its own. I'll get a fix in, and this should be good in the next build. For now, manually clicking the re-index button will create a new persist folder for you.
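Until that fix lands, the FileNotFoundError above suggests a guard along these lines: fall back to building (and persisting) a fresh index when the persist directory is missing. This is a sketch under that assumption, reusing persistency_dir from the traceback, not the shipped plugin code:

import os
from llama_index.core import StorageContext, VectorStoreIndex, load_index_from_storage

def load_or_build_index(documents, persistency_dir):
    """Load the persisted index if it exists; otherwise build and persist a fresh one."""
    if os.path.exists(os.path.join(persistency_dir, "docstore.json")):
        storage_context = StorageContext.from_defaults(persist_dir=persistency_dir)
        return load_index_from_storage(storage_context)
    # No persisted index yet (e.g. the rag folder was just recreated): rebuild it.
    index = VectorStoreIndex.from_documents(documents)
    index.storage_context.persist(persist_dir=persistency_dir)
    return index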
OK, that created the persist folder, but now when I ask a question it looks like maybe the model gets stuck replying or something. My GPU gets pegged at 100% and I get this error:
Settings: {'_llm': None, '_embed_model': None, '_callback_manager': None, '_tokenizer': None, '_node_parser': SentenceSplitter(include_metadata=True, include_prev_next_rel=True, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7f6e4bd31450>, id_func=<function default_id_func at 0x7f6f4bfb3a60>, chunk_size=256, chunk_overlap=50, separator=' ', paragraph_separator='\n\n\n', secondary_chunking_regex='[^,.;。?!]+[,.;。?!]?'), '_prompt_helper': PromptHelper(context_window=2048, num_output=256, chunk_overlap_ratio=0.1, chunk_size_limit=None, separator=' '), '_transformations': None}
Loaded 15 docs
Traceback (most recent call last):
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 101, in map_httpcore_exceptions
yield
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 250, in handle_request
resp = self._pool.handle_request(req)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 256, in handle_request
raise exc from None
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_sync/connection_pool.py", line 236, in handle_request
response = connection.handle_request(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_sync/connection.py", line 103, in handle_request
return self._connection.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_sync/http11.py", line 136, in handle_request
raise exc
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_sync/http11.py", line 106, in handle_request
) = self._receive_response_headers(**kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_sync/http11.py", line 177, in _receive_response_headers
event = self._receive_event(timeout=timeout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_sync/http11.py", line 217, in _receive_event
data = self._network_stream.read(
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_backends/sync.py", line 126, in read
with map_exceptions(exc_map):
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
raise to_exc(exc) from exc
httpcore.ReadTimeout: timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 955, in _request
response = self._client.send(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_client.py", line 914, in send
response = self._send_handling_auth(
^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_client.py", line 942, in _send_handling_auth
response = self._send_handling_redirects(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_client.py", line 979, in _send_handling_redirects
response = self._send_single_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_client.py", line 1014, in _send_single_request
response = transport.handle_request(request)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 249, in handle_request
with map_httpcore_exceptions():
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/contextlib.py", line 158, in __exit__
self.gen.throw(typ, value, traceback)
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/httpx/_transports/default.py", line 118, in map_httpcore_exceptions
raise mapped_exc(message) from exc
httpx.ReadTimeout: timed out
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/matt/.transformerlab/src/transformerlab/plugin_sdk/plugin_harness.py", line 39, in <module>
main.main()
File "/home/matt/.transformerlab/workspace/plugins/llamaindex_simple_document_search/main.py", line 148, in main
rag_response = query_engine.query(args.query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py", line 52, in query
query_result = self._query(str_or_query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 179, in _query
response = self._response_synthesizer.synthesize(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/base.py", line 241, in synthesize
response_str = self.get_response(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 43, in get_response
return super().get_response(
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 179, in get_response
response = self._give_response_single(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 241, in _give_response_single
program(
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 85, in __call__
answer = self._llm.predict(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/llm.py", line 605, in predict
chat_response = self.chat(messages)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai_like/base.py", line 117, in chat
return super().chat(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 173, in wrapped_llm_chat
f_return_val = f(_self, messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 374, in chat
return chat_fn(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 107, in wrapper
return retry(f)(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 336, in wrapped_f
return copy(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 475, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 376, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 418, in exc_check
raise retry_exc.reraise()
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 185, in reraise
raise self.last_attempt.result()
^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 478, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 470, in _chat
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_utils/_utils.py", line 279, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 914, in create
return self._post(
^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1242, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 919, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 964, in _request
return self._retry_request(
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1057, in _retry_request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 964, in _request
return self._retry_request(
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1057, in _retry_request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 964, in _request
return self._retry_request(
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1057, in _retry_request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 974, in _request
raise APITimeoutError(request=request) from err
openai.APITimeoutError: Request timed out.
Hmm, this means there is not enough space on the GPU to handle everything, as it just sends your query plus the two most relevant contexts from your RAG data. It may be because of the size of your GPU: the model you're using takes up a bit more than 2.3 GB of GPU memory on its own.
I tried switching to GGUF to see if faster inference would help my machine not get stuck. I'm getting much faster response times for chat, but now I get a different error when doing RAG:
Settings: {'_llm': None, '_embed_model': None, '_callback_manager': None, '_tokenizer': None, '_node_parser': SentenceSplitter(include_metadata=True, include_prev_next_rel=True, callback_manager=<llama_index.core.callbacks.base.CallbackManager object at 0x7f6ff07113d0>, id_func=<function default_id_func at 0x7f70f0adba60>, chunk_size=256, chunk_overlap=50, separator=' ', paragraph_separator='\n\n\n', secondary_chunking_regex='[^,.;。?!]+[,.;。?!]?'), '_prompt_helper': PromptHelper(context_window=2048, num_output=256, chunk_overlap_ratio=0.1, chunk_size_limit=None, separator=' '), '_transformations': None}
Loaded 15 docs
Traceback (most recent call last):
File "/home/matt/.transformerlab/src/transformerlab/plugin_sdk/plugin_harness.py", line 39, in <module>
main.main()
File "/home/matt/.transformerlab/workspace/plugins/llamaindex_simple_document_search/main.py", line 148, in main
rag_response = query_engine.query(args.query)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/base/base_query_engine.py", line 52, in query
query_result = self._query(str_or_query_bundle)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 179, in _query
response = self._response_synthesizer.synthesize(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/base.py", line 241, in synthesize
response_str = self.get_response(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 43, in get_response
return super().get_response(
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 179, in get_response
response = self._give_response_single(
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 241, in _give_response_single
program(
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/response_synthesizers/refine.py", line 85, in __call__
answer = self._llm.predict(
^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/llm.py", line 605, in predict
chat_response = self.chat(messages)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai_like/base.py", line 117, in chat
return super().chat(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/instrumentation/dispatcher.py", line 322, in wrapper
result = func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/core/llms/callbacks.py", line 173, in wrapped_llm_chat
f_return_val = f(_self, messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 374, in chat
return chat_fn(messages, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 107, in wrapper
return retry(f)(self, *args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 336, in wrapped_f
return copy(f, *args, **kw)
^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 475, in __call__
do = self.iter(retry_state=retry_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 376, in iter
result = action(retry_state)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 398, in <lambda>
self._add_action_func(lambda rs: rs.outcome.result())
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 449, in result
return self.__get_result()
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
raise self._exception
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/tenacity/__init__.py", line 478, in __call__
result = fn(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/llama_index/llms/openai/base.py", line 470, in _chat
response = client.chat.completions.create(
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_utils/_utils.py", line 279, in wrapper
return func(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/resources/chat/completions/completions.py", line 914, in create
return self._post(
^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1242, in post
return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 919, in request
return self._request(
^^^^^^^^^^^^^^
File "/home/matt/.transformerlab/envs/transformerlab/lib/python3.11/site-packages/openai/_base_client.py", line 1023, in _request
raise self._make_status_error_from_response(err.response) from None
openai.BadRequestError: Error code: 400 - {'object': 'error', 'message': 'Expecting value: line 1 column 1 (char 0)', 'code': 50001}
Yeah, it's the same issue under a different error: your model worker gets killed because of the memory issues.
Hi there, I'm trying RAG again, this time with a much larger GPU. I'm getting what looks like a code error:
Traceback (most recent call last):
File "/root/.transformerlab/src/transformerlab/plugin_sdk/plugin_harness.py", line 26, in <module>
import main
File "/root/.transformerlab/workspace/plugins/llamaindex_simple_document_search/main.py", line 4, in <module>
from llama_index.llms.openai_like import OpenAILike
File "/root/.transformerlab/workspace/plugins/llamaindex_simple_document_search/venv/lib/python3.11/site-packages/llama_index/llms/openai_like/__init__.py", line 1, in <module>
from llama_index.llms.openai_like.base import OpenAILike
File "/root/.transformerlab/workspace/plugins/llamaindex_simple_document_search/venv/lib/python3.11/site-packages/llama_index/llms/openai_like/base.py", line 20, in <module>
from llama_index.llms.openai.base import OpenAI, Tokenizer
File "/root/.transformerlab/workspace/plugins/llamaindex_simple_document_search/venv/lib/python3.11/site-packages/llama_index/llms/openai/__init__.py", line 2, in <module>
from llama_index.llms.openai.responses import OpenAIResponses
File "/root/.transformerlab/workspace/plugins/llamaindex_simple_document_search/venv/lib/python3.11/site-packages/llama_index/llms/openai/responses.py", line 7, in <module>
from openai.types.responses import (
ImportError: cannot import name 'ResponseOutputTextAnnotationAddedEvent' from 'openai.types.responses' (/root/.transformerlab/workspace/plugins/llamaindex_simple_document_search/venv/lib/python3.11/site-packages/openai/types/responses/__init__.py)
I'm getting the same error
ImportError: cannot import name 'ResponseOutputTextAnnotationAddedEvent' from 'openai.types.responses'
on macOS.
The plugin has a pinned version of the openai module, while the llama-index* modules are not pinned.
Removing the pinned version of openai in setup.sh solves the issue, at least for RAG on text documents.
Hi, thanks for reporting this! We will fix this ASAP and also post the new setup.sh so you can continue using the plugin until we do a release.
The PR for a fix is open (transformerlab/transformerlab-api#329), but here is what you need to change setup.sh to:
uv pip install llama-index==0.12.38
uv pip install llama-index-llms-openai-like==0.4.0
uv pip install openai==1.82.1
uv pip install llama-index-embeddings-huggingface==0.5.4
uv pip install cryptography==44.0.2 # needed to read PDFs
This fix was included in the latest release (v0.18.0, released on Monday). Closing, but please reopen if you find any issues!