[Question]: Clarification on HyDE prompt, chunking, and LLM settings for Ultradomain Winrate (Table 1 reproduction)
Do you need to ask a question?
- [ ] I have searched the existing question and discussions and this question is not already answered.
- [ ] I believe this is a legitimate question, not just a bug or feature request.
Your Question
Hi, I'm currently reproducing the Ultradomain winrate results in Table 1 of the LightRAG paper, particularly the comparisons between LightRAG, HyDE, and GraphRAG.
I successfully constructed the graph with LightRAG and evaluated winrate using answers generated with the HyDE library. However, I'm seeing significant performance gaps depending on how HyDE is configured.
Specifically:
- When using HyDE with the default prompt, the generated answers often contain hallucinations, and the resulting winrate is significantly lower than what's reported in the paper.
- When modifying the prompt to explicitly restrict hallucination and enforce information-grounded answers, the winrate improves notably. (I'll attach both winrate evaluation graphs in this issue.)
To properly reproduce the experiment, could you clarify:
1. What prompt template was used with HyDE in Table 1?
2. What chunk size and chunking strategy were used when processing the Ultradomain documents? (My current chunking is sketched right after this list for reference.)
3. Which language model (e.g., OpenAI GPT-4, Claude, GPT-4o) was used to generate answers in the HyDE evaluation?
4. Were any additional hyperparameters changed (e.g., top-k retrieval count, number of hypotheses n in HyDE generation)?
5. Similarly, what were the initial settings for GraphRAG in Table 1?
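For reference, my current chunking is a simple fixed-size token split. The sketch below shows what I'm doing now, not a claim about the paper's setup: the token-based split with `tiktoken`, the `cl100k_base` encoding, and the zero overlap are my assumptions, which is exactly what I'd like to confirm.

```python
# Sketch of my current chunking of the Ultradomain documents.
# Assumptions (not from the paper): token-based split via tiktoken,
# "cl100k_base" encoding, no overlap between consecutive chunks.
import tiktoken

CHUNK_SIZE = 128     # tokens per chunk (the value used in both graphs below)
CHUNK_OVERLAP = 0    # my assumption; please confirm what was used for Table 1

def chunk_document(text: str,
                   chunk_size: int = CHUNK_SIZE,
                   overlap: int = CHUNK_OVERLAP) -> list[str]:
    """Split a document into fixed-size token windows."""
    enc = tiktoken.get_encoding("cl100k_base")
    tokens = enc.encode(text)
    step = max(chunk_size - overlap, 1)
    return [enc.decode(tokens[i:i + chunk_size])
            for i in range(0, len(tokens), step)]
```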
I'm attaching:
📊 Graph 1: HyDE default prompt, chunk size 128 → low winrate
```python
CHUNK_SIZE = 128

# Default-style prompt. {query} is interpolated when the template is built
# (inside the per-query loop); {{contexts}} stays as an escaped placeholder
# that is later filled with the retrieved chunks via .format(contexts=...).
prompt_template = f"""
Based on the following information, please answer the question.
Question: {query}
Documents:
{{contexts}}
Answer:
"""

# Answer generator from the HyDE library
generator = OpenAIGenerator(
    model_name="gpt-4o-mini",  # answer model
    n=8,                       # number of sampled generations
    max_tokens=512,
    temperature=0.7,
)
```
📊 Graph 2: Custom hallucination-restricted prompt, same chunk size → improved winrate
```python
CHUNK_SIZE = 128

# Hallucination-restricted prompt (adapted from LightRAG's response prompt).
# As above, {query} is interpolated per query and {{contexts}} is filled later.
prompt_template = f"""
---Role---
You are a helpful assistant responding to the user query.
---Goal---
Generate a concise response based on the following Information and follow the Response Rules. Do not include information not provided by the following Information.
---User Query---
{query}
---Information---
{{contexts}}
---Response Rules---
- Use markdown formatting with appropriate section headings.
- Please respond in the same language as the user's question.
- Ensure the response maintains continuity with the conversation history.
- If you don't know the answer, just say so.
- Do not make anything up. Do not include information not provided by the Information.
Answer:
"""

# Same generator settings as for Graph 1
generator = OpenAIGenerator(
    model_name="gpt-4o-mini",
    n=8,
    max_tokens=512,
    temperature=0.7,
)
```
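In both configurations, my adapted script fills the template and calls the generator per query roughly as sketched below. This reflects my own wiring only: `retrieve_chunks` is a hypothetical placeholder for the retrieval step, and I'm assuming the generator's `generate(prompt)` call returns the list of sampled completions, so please point out if the original evaluation pipeline differs.

```python
# Sketch of my per-query answer generation. `retrieve_chunks` and the
# generate(prompt) call reflect my adapted script, not the paper's pipeline.

def answer_query(query: str, prompt_template: str, generator,
                 retrieve_chunks, top_k: int = 5) -> str:
    # Retrieve top-k chunks and join them into a single context block.
    contexts = "\n\n".join(retrieve_chunks(query, top_k))

    # The f-string template above already interpolated {query}; the escaped
    # {{contexts}} placeholder is filled in here.
    prompt = prompt_template.format(contexts=contexts)

    # Sample n completions (n=8 above) and keep the first as the final answer.
    completions = generator.generate(prompt)
    return completions[0]
```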
📊 Table 1 From the Paper
📝 Full evaluation script (adapted with the custom hallucination-restricted prompt): HyDE Evaluation Gist
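For transparency, the winrate judging step in the gist follows a pairwise LLM-judge setup over the LightRAG evaluation dimensions (Comprehensiveness, Diversity, Empowerment, plus an Overall verdict). The sketch below shows roughly what I'm doing; the judge model (`gpt-4o`), the prompt wording, and the JSON output schema are my own choices and may well differ from the setup behind Table 1.

```python
# Sketch of the pairwise winrate judge I used. The judge model, prompt wording,
# and JSON schema are my own choices, not confirmed settings from the paper.
import json
from openai import OpenAI

client = OpenAI()

JUDGE_PROMPT = """Compare two answers to the same question along Comprehensiveness,
Diversity, Empowerment, and Overall quality.

Question: {question}

Answer 1:
{answer1}

Answer 2:
{answer2}

Respond in JSON, e.g. {{"Overall": {{"winner": "Answer 1", "explanation": "..."}}}}.
"""

def judge_pair(question: str, answer1: str, answer2: str) -> str:
    """Return the judge's overall winner: "Answer 1" or "Answer 2"."""
    response = client.chat.completions.create(
        model="gpt-4o",   # judge model -- my assumption
        temperature=0.0,
        response_format={"type": "json_object"},
        messages=[{
            "role": "user",
            "content": JUDGE_PROMPT.format(
                question=question, answer1=answer1, answer2=answer2),
        }],
    )
    verdict = json.loads(response.choices[0].message.content)
    return verdict["Overall"]["winner"]
```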
Thank you in advance for your help — accurate reproduction of the original settings would be extremely valuable!
Additional Context
No response