
Need help on the observations of dspy experiments

Open GaneshSKulkarni opened this issue 3 months ago • 5 comments

Hi - I am working on a chatbot that answers questions from documents using RAG. I have used the DSPy framework for prompt tuning, experimented with DSPy for our use case, and measured performance with the RAGAS Answer Correctness metric. I have some observations about the performance of the prompts generated with DSPy.

For this experiment we are using the GPT-4-32K model. I generated two prompts: one from an uncompiled DSPy signature, and a custom prompt sent directly via the Azure OpenAI API. The same question and (previously retrieved) context are sent with both prompts. When we inspect the prompts sent to the model, both are the same except that one is sent via the DSPy framework and the other directly via LangChain APIs. The observation is that the responses from the direct API call are more detailed and perform better than the DSPy responses almost every time.
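To make the "both prompts are the same" check concrete, one quick way is to diff the two captured prompt strings with the standard library (the function name and labels here are illustrative, not part of DSPy):

```python
import difflib

def diff_prompts(dspy_prompt: str, direct_prompt: str) -> list[str]:
    """Return unified-diff lines between the two prompt strings.

    An empty list means the prompts are byte-for-byte identical."""
    return list(difflib.unified_diff(
        dspy_prompt.splitlines(keepends=True),
        direct_prompt.splitlines(keepends=True),
        fromfile="dspy_prompt",
        tofile="direct_prompt",
    ))

# Identical prompts produce an empty diff:
assert diff_prompts("Answer the question.", "Answer the question.") == []
```

If the diff really is empty, any remaining difference in response quality has to come from request parameters (temperature, max tokens, stop sequences) rather than the prompt text itself.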

Can anyone please share some thoughts on why this might happen?

We also observed that DSPy always preprocesses the input and replaces newline characters with spaces, thereby merging all the chunks provided in the context. Is there a way to avoid this? (For the experiment above, I replicated this behaviour in the direct API prompt by replacing newlines with spaces, so the performance comparison would be fair.)
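For clarity, the preprocessing described above amounts to something like the following (a sketch of the observed behaviour, not DSPy's actual implementation; the function name is made up):

```python
def flatten_like_dspy(chunks: list[str]) -> str:
    """Mimic the observed preprocessing: every newline inside a chunk
    becomes a space, so the chunks' internal line structure is lost."""
    return " ".join(chunk.replace("\n", " ") for chunk in chunks)

context = ["Revenue:\n$10M", "Costs:\n$4M"]
print(flatten_like_dspy(context))  # "Revenue: $10M Costs: $4M"
```

This is what I applied to the direct-API prompt as well, so both variants saw the same flattened context.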

Thanks in advance.

GaneshSKulkarni avatar Apr 16 '24 16:04 GaneshSKulkarni

Hi @GaneshSKulkarni , you can use inspect_history to get some more observability on the prompts and outputs being passed in. If you are looking for more in-depth tracing, feel free to check out how to do so using Arise Phoenix in DSPy.
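For reference, a minimal sketch of how `inspect_history` is typically used (assuming the early-2024 DSPy 2.x API; the model name and signature are illustrative, and an OpenAI-compatible API key must be configured):

```python
import dspy

# Illustrative setup; swap in your Azure OpenAI configuration as needed.
lm = dspy.OpenAI(model="gpt-4-32k", max_tokens=512)
dspy.settings.configure(lm=lm)

qa = dspy.Predict("context, question -> answer")
qa(context="Revenue: $10M. Costs: $4M.", question="What is the revenue?")

# Print the most recent prompt/completion pair DSPy sent to the LM:
lm.inspect_history(n=1)
```

Comparing this output character-for-character against the prompt sent through the direct API call is the fastest way to rule the prompt text in or out as the cause.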

arnavsinghvi11 avatar Apr 18 '24 16:04 arnavsinghvi11

Hi @arnavsinghvi11

Thank you for the response and suggestions. Please find points herewith.

  • I have used inspect_history and in-depth tracing in DSPy to make the observations mentioned above, and I followed the in-depth notebooks for the DSPy implementation.
  • To reiterate my question: can you please share your thoughts on why DSPy is underperforming and not utilizing the context information provided? To mention again, I used the same prompts (obtained through inspect_history) to generate responses via the custom prompt in order to analyze the DSPy responses.
  • Why does DSPy remove newline ('\n') characters during preprocessing? This results in a loss of information (observed via the prompts from inspect_history).

I would be happy to provide any information needed; please let me know. Please help me address these specific concerns.

Thank you.

GaneshSKulkarni avatar Apr 19 '24 05:04 GaneshSKulkarni

Hi @GaneshSKulkarni , you can use inspect_history to get some more observability on the prompts and outputs being passed in. If you are looking for more in-depth tracing, feel free to check out how to do so using Arise Phoenix in DSPy.

Hi @arnavsinghvi11 , is there a way to save the trace locally to a file and load it into Phoenix later for inspection? Once the DSPy execution stops, the traces also go away. I would love to load saved traces later for inspection.

Thanks

imflash217 avatar Apr 22 '24 01:04 imflash217

@GaneshSKulkarni , I have also experienced a performance degradation (almost always) when using DSPy, while the direct LLM API call via LangChain performs noticeably better. In my case, the output of the DSPy program included the whole prompt itself; that issue is now tracked in https://github.com/stanfordnlp/dspy/issues/662, but I am not sure whether you are hitting the same problem or something else.

If you can share some example observations, it would help with debugging.

imflash217 avatar Apr 22 '24 01:04 imflash217

Hi @GaneshSKulkarni , you can use inspect_history to get some more observability on the prompts and outputs being passed in. If you are looking for more in-depth tracing, feel free to check out how to do so using Arise Phoenix in DSPy.

Hi @arnavsinghvi11 , is there a way to save the trace locally to a file and load it into Phoenix later for inspection? Once the DSPy execution stops, the traces also go away. I would love to load saved traces later for inspection.

Thanks

Does saving/loading help with this? DSPy programs and their internal traces can be saved and loaded.
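A minimal sketch of that save/load flow (assuming a DSPy 2.x module; `RAG`, `compiled_rag`, and the file name are hypothetical placeholders for your own program):

```python
# Assumption: `compiled_rag` is a DSPy module, e.g. the result of a
# teleprompter's compile() step, whose state you want to persist.
compiled_rag.save("compiled_rag.json")

# Later, in a fresh process: rebuild the same module class first,
# then load the saved state (demos, signatures, etc.) into it.
rag = RAG()                      # hypothetical module class; must match the saved program
rag.load("compiled_rag.json")
```

Note this persists the program state rather than the live Phoenix spans; exporting the Phoenix traces themselves would be a separate step on the Phoenix side.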

arnavsinghvi11 avatar Apr 27 '24 17:04 arnavsinghvi11