
Hang when generating datasets

Open xidian237 opened this issue 1 year ago • 2 comments

Describe the bug

I tried the simplest way to generate a dataset for a specific document, but the process hung.

Ragas version: 0.1.4
Python version: 3.11

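Code to Reproduce

from langchain_community.document_loaders import TextLoader
# Assumed import: the original post did not show where ChatOpenAI and
# OpenAIEmbeddings come from.
from langchain_openai import ChatOpenAI, OpenAIEmbeddings
from ragas.testset.generator import TestsetGenerator
from ragas.testset.evolutions import simple, reasoning, multi_context, conditional

# Load the single Markdown document the test set is generated from.
loader = TextLoader("load-statement-data-manipulation-20f83c8.md")
documents = loader.load()

# Copy the source path into the "filename" metadata key.
for document in documents:
    document.metadata["filename"] = document.metadata["source"]

gen_ai_llm = ChatOpenAI(model_name='gpt-4-32k', temperature=0.0)
embeddings = OpenAIEmbeddings(model_name='text-embedding-ada-002')

# The same model serves as both generator and critic.
generator_llm = gen_ai_llm
critic_llm = gen_ai_llm

generator = TestsetGenerator.from_langchain(generator_llm, critic_llm, embeddings)

# Request 5 samples, all from the "simple" evolution, with debug logging on.
testset = generator.generate_with_langchain_docs(documents, with_debugging_logs=True, test_size=5, distributions={simple: 1.0})

print(testset.to_pandas())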

Error trace

Generating: 0%| | 0/5 [00:00<?, ?it/s]
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.filters.DEBUG] node filter: {'score': 8.5}
[ragas.testset.evolutions.DEBUG] keyphrases in merged node: ['LOAD Statement in data manipulation', 'Examples and syntax of LOAD Statement']
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.evolutions.INFO] seed question generated: "What are some examples of how to use the LOAD statement in data manipulation?"
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear in its intent, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it seeks practical examples, making the type of answer it seeks clear. The question is independent and does not rely on external references or specific documents. It is understandable and answerable for those with knowledge in data manipulation and the use of the 'LOAD' statement.", 'verdict': '1'}
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear and specific, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it is clear in its intent to seek practical examples. This question can be answered by someone with knowledge in data manipulation and the use of the 'LOAD' statement, without needing additional context or external references.", 'verdict': '1'}
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear in its intent, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it seeks a specific type of information (examples). This makes the question understandable and answerable for those with knowledge in data manipulation and SQL or similar languages where the 'LOAD' statement is used. The question could be made even clearer by specifying the programming language or database system of interest, as the use of the 'LOAD' statement can vary across different systems.", 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;', loading specific columns of a table into memory with 'LOAD A (A,B);', and querying the load status of a table with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 20%|██ | 1/5 [00:21<01:26, 21.51s/it]
[ragas.testset.filters.DEBUG] filtered question: {'feedback': "The question is clear in its intent, asking for examples of how to use the 'LOAD' statement in data manipulation. It specifies the topic of interest (the 'LOAD' statement) and the context (data manipulation), and it seeks a specific type of information (examples). This makes the question understandable and answerable for those with knowledge in data manipulation and SQL or similar languages where a 'LOAD' statement might be used. However, to improve clarity, the question could specify the programming language or database system of interest, as the use of the 'LOAD' statement can vary across different systems.", 'verdict': '1'}
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;', loading specific columns of a table into memory with 'LOAD A (A,B);', and querying the load status of a table with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 40%|████ | 2/5 [00:30<00:42, 14.20s/it]
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;', loading specific columns of a table into memory with 'LOAD A (A,B);', and querying the load status of a table with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 60%|██████ | 3/5 [00:32<00:17, 8.75s/it]
[ragas.testset.evolutions.DEBUG] answer generated: {'answer': "Some examples of using the LOAD statement in data manipulation include loading a table into memory with 'LOAD A all;' or loading specific columns of a table into memory with 'LOAD A (A,B);'. You can also query the load status of a table using the m_cs_tables monitoring view with 'SELECT loaded FROM m_cs_tables WHERE table_name = 'A';'.", 'verdict': '1'}
Generating: 80%|████████ | 4/5 [00:38<00:07, 7.64s/it]

xidian237, Apr 02 '24 06:04
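The trace above stops at 4/5 without an exception, so it does not show where the process is stuck. One way to get more information is to have Python dump every thread's stack if the call runs too long. This is a minimal sketch using only the standard-library `faulthandler` module. The helper name `run_with_hang_watchdog` and the 120-second default are assumptions for illustration and are not part of Ragas.

```python
import faulthandler
import sys


def run_with_hang_watchdog(fn, timeout_s=120.0, file=None):
    """Run fn(); if it is still running after timeout_s seconds, dump the
    stack of every thread to `file`, then keep waiting for fn to finish.

    Hypothetical helper for diagnosing a hang; it does not fix one.
    """
    # faulthandler writes to the file descriptor directly, so `file` must be
    # a real file object with a fileno() (sys.stderr normally qualifies).
    target = sys.stderr if file is None else file
    faulthandler.dump_traceback_later(timeout_s, exit=False, file=target)
    try:
        return fn()
    finally:
        # Disarm the watchdog whether fn() returned or raised.
        faulthandler.cancel_dump_traceback_later()


# Possible usage with the reproduction code from the report:
# testset = run_with_hang_watchdog(
#     lambda: generator.generate_with_langchain_docs(
#         documents, test_size=5, distributions={simple: 1.0}
#     ),
#     timeout_s=300,
# )
```

If the dump fires, the frames at the top of each thread's stack show what the process was waiting on, which is useful detail to attach to a report like this one.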

I have the same issue as you. Have you solved the problem? I would appreciate it if you could reply.

Kevin-JiXu, Apr 17 '24 07:04

@Kevin-JiXu, actually not yet.

xidian237, Apr 17 '24 07:04

Is this problem solved? I'm encountering the same issue.

xiyang-aads-lilly, Jul 10 '24 18:07

> Is this problem solved? I'm encountering the same issue.

Unfortunately, I was not able to solve this problem, so I gave up on using this tool.

Kevin-JiXu, Jul 15 '24 02:07

@Kevin-JiXu @xiyang-aads-lilly that is unfortunate. This was fixed with #1093. Do try it again and let us know if it works for you. If not, do post your errors here. I would love to get you folks back in 🙂

jjmachan, Aug 02 '24 06:08
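Before retrying, it may help to confirm that the environment is no longer on the 0.1.4 release named in the report. The thread does not say which release first includes #1093, so the sketch below only checks that the installed version is newer than the reported one. The helper names are hypothetical, and the version parsing is deliberately simplified (it ignores suffixes such as `rc1` or `.post1`).

```python
import re
from importlib import metadata

REPORTED_VERSION = "0.1.4"  # Ragas version from the original report


def version_tuple(version):
    """Turn '0.1.13' (or '0.2.0rc1') into a comparable tuple of ints."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return tuple(int(part) for part in match.group().split("."))


def is_newer_than_reported(installed, reported=REPORTED_VERSION):
    """True if `installed` is strictly newer than the reported version."""
    return version_tuple(installed) > version_tuple(reported)


def installed_ragas_version():
    """Return the installed ragas version string, or None if not installed."""
    try:
        return metadata.version("ragas")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_ragas_version()
    if found is None:
        print("ragas is not installed in this environment")
    elif is_newer_than_reported(found):
        print(f"ragas {found} is newer than {REPORTED_VERSION}; worth retrying")
    else:
        print(f"ragas {found} is not newer than {REPORTED_VERSION}; upgrade first")
```

A newer version alone does not prove the fix is present, so if the hang persists after upgrading, the installed version number is worth including when posting errors to the thread.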