Ragas synthetic test data generation error using AzureOpenAIEmbeddings
I was trying to generate a test set with the RAGAS framework, following https://docs.ragas.io/en/stable/concepts/testset_generation.html, but I'm getting an error.
Please have a look at the error below.
ragas.exceptions.ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass raise_exceptions=False incase you want to show only a warning message instead.
Task was destroyed but it is pending!
task: <Task pending name='Task-3' coro=<as_completed.
And my code is:
loader = PyPDFLoader("
# Same LLM used for both generator_llm and critic_llm
generator_llm = llm()
critic_llm = llm()
embeddings = embeddings()  # AzureOpenAIEmbeddings

generator = TestsetGenerator.from_langchain(generator_llm, critic_llm, embeddings)
distributions = {simple: 0.5, multi_context: 0.4, reasoning: 0.1}

testset = generator.generate_with_langchain_docs(documents, 10, distributions)
testset.to_pandas()
Versions: openai 1.17.0, ragas 0.1.4 / 0.1.6
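The error message itself points at `raise_exceptions=False`, and the `generate_with_langchain_docs` signature that appears in a traceback later in this thread also lists `with_debugging_logs`. A minimal sketch of a call with both switched on, so the underlying failure is easier to see, might look like this. The helper name `generate_with_diagnostics` is hypothetical, and it assumes your installed Ragas version accepts these keywords:

```python
from typing import Any


def generate_with_diagnostics(
    generator: Any, documents: list, distributions: dict, test_size: int = 10
) -> Any:
    """Call generate_with_langchain_docs with extra diagnostics enabled.

    Hypothetical helper: `generator` is expected to be a TestsetGenerator.
    """
    return generator.generate_with_langchain_docs(
        documents,
        test_size=test_size,
        distributions=distributions,
        with_debugging_logs=True,  # keyword taken from the signature in the traceback
        raise_exceptions=False,    # warn instead of raising, as the error message suggests
    )
```

The real cause is normally in the traceback printed above `ExceptionInRunner`, so these flags only help to surface it; they do not fix anything on their own.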
I don't think this is a Ragas issue - are you still facing this @sona-16 ?
Hi jjmachan,
I haven't tried it again yet. I will do so today and update you.
I am facing this same issue.
Filename and doc_id are the same for all nodes.
Generating: 52%|█████▎ | 21/40 [02:41<02:26, 7.69s/it]
Exception in thread Thread-21:
Traceback (most recent call last):
File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/threading.py", line 973, in _bootstrap_inner
self.run()
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/executor.py", line 96, in run
results = self.loop.run_until_complete(self._aresults())
File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/base_events.py", line 642, in run_until_complete
return future.result()
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/executor.py", line 84, in _aresults
raise e
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/executor.py", line 79, in _aresults
r = await future
File "/Applications/Xcode.app/Contents/Developer/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/asyncio/tasks.py", line 614, in _wait_for_one
return f.result() # May raise f.exception().
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/executor.py", line 38, in sema_coro
return await coro
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/executor.py", line 112, in wrapped_callable_async
return counter, await callable(*args, **kwargs)
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/testset/evolutions.py", line 144, in evolve
return await self.generate_datarow(
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/testset/evolutions.py", line 210, in generate_datarow
selected_nodes = [
File "/Users/L037301/Documents/GitHub/ragas/src/ragas/testset/evolutions.py", line 213, in
if i - 1 < len(current_nodes.nodes)
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
ExceptionInRunner Traceback (most recent call last)
File /Users/L037301/Documents/GitHub/ragas/src/ragas/tryout.py:1
----> 1 testset4 = generator.generate_with_langchain_docs(pages4, test_size=40, distributions={simple: 0.5, reasoning: 0.25, multi_context: 0.25})

File ~/Documents/GitHub/ragas/src/ragas/testset/generator.py:179, in TestsetGenerator.generate_with_langchain_docs(self, documents, test_size, distributions, with_debugging_logs, is_async, raise_exceptions, run_config)
    174 # chunk documents and add to docstore
    175 self.docstore.add_documents(
    176     [Document.from_langchain_document(doc) for doc in documents]
    177 )
--> 179 return self.generate(
    180     test_size=test_size,
    181     distributions=distributions,
    182     with_debugging_logs=with_debugging_logs,
    183     is_async=is_async,
    184     raise_exceptions=raise_exceptions,
    185     run_config=run_config,
    186 )

File ~/Documents/GitHub/ragas/src/ragas/testset/generator.py:274, in TestsetGenerator.generate(self, test_size, distributions, with_debugging_logs, is_async, raise_exceptions, run_config)
    272 test_data_rows = exec.results()
    273 if not test_data_rows:
--> 274     raise ExceptionInRunner()
    276 except ValueError as e:
    277     raise e

ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass raise_exceptions=False incase you want to show only a warning message instead.
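The `TypeError` above means that `i` was `None` inside the list comprehension at evolutions.py line 213, so the sequence being iterated contained at least one `None` entry. As a rough illustration only (this is not the Ragas code or its fix; `select_nodes` and its arguments are hypothetical names), a defensive version of that selection step could skip invalid entries, assuming the indices are meant to be 1-based as the `i - 1` suggests:

```python
from typing import Any, List, Optional, Sequence


def select_nodes(nodes: Sequence[Any], indices: Optional[Sequence[Any]]) -> List[Any]:
    """Pick nodes by 1-based index, skipping entries that are not valid ints."""
    if not indices:
        return []
    selected = []
    for i in indices:
        # bool is a subclass of int, so exclude it explicitly
        if isinstance(i, int) and not isinstance(i, bool) and 1 <= i <= len(nodes):
            selected.append(nodes[i - 1])
    return selected


# None and out-of-range indices are dropped instead of raising TypeError
print(select_nodes(["n1", "n2", "n3"], [1, None, 3, 7]))
```

Where the `None` comes from is not visible in this traceback, so the guard only shows how the crash could be avoided, not why the bad value was produced.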
Any update? I'm facing the same message for a ragas.evaluate inside a ThreadPoolExecutor
I also have the same issue. Everything worked fine for a set of 300 PDFs, and now all of a sudden the same code gives the error below:
ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass raise_exceptions=False incase you want to show only a warning message instead.
Hi Team,
I'm still facing the same error. I suspect it is caused either by the Hugging Face LLM I'm using or by limited compute power. I'm running this in a Google Colab notebook with a CPU-only setup.
Hi @damlitos, could you please share your code snippet? I used only 2 pages of a PDF and still got no proper output. As you said, you may be getting a satisfactory result with a smaller number of pages.
error:
embedding nodes: 0% 0/4 [02:27<?, ?it/s]
Exception in thread Thread-13:
Traceback (most recent call last):
File "/usr/lib/python3.10/threading.py", line 1016, in _bootstrap_inner
self.run()
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 96, in run
results = self.loop.run_until_complete(self._aresults())
File "/usr/lib/python3.10/asyncio/base_events.py", line 649, in run_until_complete
return future.result()
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 84, in _aresults
raise e
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 79, in _aresults
r = await future
File "/usr/lib/python3.10/asyncio/tasks.py", line 571, in _wait_for_one
return f.result() # May raise f.exception().
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 38, in sema_coro
return await coro
File "/usr/local/lib/python3.10/dist-packages/ragas/executor.py", line 112, in wrapped_callable_async
return counter, await callable(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/ragas/embeddings/base.py", line 23, in embed_text
embs = await self.embed_texts([text], is_async=is_async)
File "/usr/local/lib/python3.10/dist-packages/ragas/embeddings/base.py", line 33, in embed_texts
return await aembed_documents_with_retry(texts)
File "/usr/local/lib/python3.10/dist-packages/tenacity/_asyncio.py", line 142, in async_wrapped
return await fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/tenacity/_asyncio.py", line 58, in __call__
do = await self.iter(retry_state=retry_state)
File "/usr/local/lib/python3.10/dist-packages/tenacity/_asyncio.py", line 110, in iter
result = await action(retry_state)
File "/usr/local/lib/python3.10/dist-packages/tenacity/_asyncio.py", line 78, in inner
return fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 410, in exc_check
raise retry_exc.reraise()
File "/usr/local/lib/python3.10/dist-packages/tenacity/__init__.py", line 183, in reraise
raise self.last_attempt.result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 451, in result
return self.__get_result()
File "/usr/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
raise self._exception
File "/usr/local/lib/python3.10/dist-packages/tenacity/_asyncio.py", line 61, in __call__
result = await fn(*args, **kwargs)
File "/usr/local/lib/python3.10/dist-packages/ragas/embeddings/base.py", line 64, in aembed_documents
return await self.embeddings.aembed_documents(texts)
File "/usr/local/lib/python3.10/dist-packages/torch/nn/modules/module.py", line 1709, in __getattr__
raise AttributeError(f"'{type(self).__name__}' object has no attribute '{name}'")
AttributeError: 'SentenceTransformer' object has no attribute 'aembed_documents'
ExceptionInRunner Traceback (most recent call last)
2 frames
/usr/local/lib/python3.10/dist-packages/ragas/testset/docstore.py in add_nodes(self, nodes, show_progress)
    252 results = executor.results()
    253 if not results:
--> 254     raise ExceptionInRunner()
    255
    256 for i, n in enumerate(nodes):
ExceptionInRunner: The runner thread which was running the jobs raised an exeception. Read the traceback above to debug it. You can also pass raise_exceptions=False incase you want to show only a warning message instead.
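The `AttributeError` in this last traceback says that a raw `SentenceTransformer` object was passed as the embeddings, while Ragas calls `aembed_documents` on it, a method from the LangChain embeddings interface. One option is LangChain's own `HuggingFaceEmbeddings` wrapper. Alternatively, here is a minimal sketch of an adapter that adds the missing methods around any object with an `encode` method. The class name `EncoderEmbeddings` is hypothetical, and whether `TestsetGenerator.from_langchain` accepts a duck-typed object like this (rather than a subclass of LangChain's `Embeddings`) is an assumption:

```python
import asyncio
from typing import Any, List


class EncoderEmbeddings:
    """Hypothetical adapter exposing LangChain-style embedding methods."""

    def __init__(self, model: Any) -> None:
        # Any object with encode(list_of_str) -> rows of numbers,
        # e.g. a SentenceTransformer instance.
        self.model = model

    def embed_documents(self, texts: List[str]) -> List[List[float]]:
        return [[float(x) for x in row] for row in self.model.encode(texts)]

    def embed_query(self, text: str) -> List[float]:
        return self.embed_documents([text])[0]

    async def aembed_documents(self, texts: List[str]) -> List[List[float]]:
        # Run the blocking encode call in the default thread pool.
        loop = asyncio.get_running_loop()
        return await loop.run_in_executor(None, self.embed_documents, texts)

    async def aembed_query(self, text: str) -> List[float]:
        return (await self.aembed_documents([text]))[0]
```

With a real model this would be used as `EncoderEmbeddings(SentenceTransformer(...))` in place of the bare model when building the generator.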