Hang when generating datasets
Describe the bug
I tried the simplest way to generate a test dataset from a specific document, but the process hung.
Ragas version: 0.1.4
Python version: 3.11
Code to Reproduce
# The original snippet did not show where ChatOpenAI and OpenAIEmbeddings
# were imported from; langchain_openai is assumed here.
from langchain_community.document_loaders import TextLoader
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context, conditional

# Load the document and copy its source path into the "filename" metadata key.
loader = TextLoader("load-statement-data-manipulation-20f83c8.md")
documents = loader.load()
for document in documents:
    document.metadata["filename"] = document.metadata["source"]

# The same chat model serves as both generator and critic.
gen_ai_llm = ChatOpenAI(model_name='gpt-4-32k', temperature=0.0)
embeddings = OpenAIEmbeddings(model_name='text-embedding-ada-002')
generator_llm = gen_ai_llm
critic_llm = gen_ai_llm

generator = TestsetGenerator.from_langchain(generator_llm, critic_llm, embeddings)
testset = generator.generate_with_langchain_docs(documents, with_debugging_logs=True, test_size=5, distributions={simple: 1.0})
print(testset.to_pandas())
Error trace
Generating: 0%| | 0/5 [00:00<?, ?it/s]
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear in its intent, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it seeks practical examples, making the type of answer it seeks clear. The question is independent and does not rely on external references or specific documents. It is understandable and answerable for those with knowledge in data manipulation and the use of the 'LOAD' statement.", 'verdict': '1'}
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear and specific, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it is clear in its intent to seek practical examples. This question can be answered by someone with knowledge in data manipulation and the use of the 'LOAD' statement, without needing additional context or external references.", 'verdict': '1'}
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear in its intent, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it seeks a specific type of information (examples). This makes the question understandable and answerable for those with knowledge in data manipulation and SQL or similar languages where the 'LOAD' statement is used. The question could be made even clearer by specifying the programming language or database system of interest, as the use of the 'LOAD' statement can vary across different systems.", 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;', loading specific columns of a table into memory with 'LOAD A (A,B);', and querying the load status of a table with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 20%|██ | 1/5 [00:21<01:26, 21.51s/it]
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear in its intent, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it seeks a specific type of information (examples). This makes the question understandable and answerable for those with knowledge in data manipulation and SQL or similar languages where a 'LOAD' statement might be used. However, to improve clarity, the question could specify the programming language or database system of interest, as the use of the 'LOAD' statement can vary across different systems.", 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;', loading specific columns of a table into memory with 'LOAD A (A,B);', and querying the load status of a table with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 40%|████ | 2/5 [00:30<00:42, 14.20s/it]
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;', loading specific columns of a table into memory with 'LOAD A (A,B);', and querying the load status of a table with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 60%|██████ | 3/5 [00:32<00:17, 8.75s/it]
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;' or loading specific columns of a table into memory with 'LOAD A (A,B);'. You can also query the load status of a table using the m_cs_tables monitoring view with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 80%|████████ | 4/5 [00:38<00:07, 7.64s/it]
I have the same issue as you. Have you resolved the problem? I would appreciate it if you could reply.
@Kevin-JiXu, actually not yet.
Is this problem solved? I am encountering the same issue.
Unfortunately, I was not able to solve this problem, so I gave up on using this tool.
@Kevin-JiXu @xiyang-aads-lilly that is unfortunate. This was fixed with #1093. Do try it again and let us know if it works for you. If not, do post your errors here. I would love to get you folks back in 🙂
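If generation still stalls after upgrading, a stack dump showing where the process is blocked would make a follow-up report here much easier to act on. Below is a minimal sketch using Python's standard `faulthandler` module. The helper name `run_with_stack_dump`, the timeout, and the log path are hypothetical choices for illustration and are not part of Ragas.

```python
import faulthandler
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_stack_dump(fn: Callable[[], T], dump_after_s: float, log_path: str) -> T:
    """Call fn(); while it is still running, write all thread stacks to
    log_path every dump_after_s seconds. Hypothetical helper, not a Ragas API."""
    with open(log_path, "w") as log:
        # Arm a watchdog that dumps the tracebacks of every thread
        # each time the timeout elapses without fn() having returned.
        faulthandler.dump_traceback_later(dump_after_s, repeat=True, file=log)
        try:
            return fn()
        finally:
            # Disarm the watchdog before the log file is closed.
            faulthandler.cancel_dump_traceback_later()


# Usage sketch with the call from the report above (assumes `generator`,
# `documents`, and `simple` are defined as in the reproduction code):
#
# testset = run_with_stack_dump(
#     lambda: generator.generate_with_langchain_docs(
#         documents, test_size=5, distributions={simple: 1.0}
#     ),
#     dump_after_s=120,
#     log_path="ragas_hang.log",
# )
```

If the run completes normally the log file stays empty. If it hangs, the file accumulates one traceback per thread at each interval, and those tracebacks can be pasted into the issue.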