dspy icon indicating copy to clipboard operation
dspy copied to clipboard

Enhance error handling

Open kylerush opened this issue 1 year ago • 6 comments

Absolutely love the work on DSPY. Thank you to everyone for your effort.

I wanted to note that I think the error handing can be improved. It's not very clear most of the time what the error is. For example, if you forget the following:

turbo = dspy.OpenAI(model="gpt-3.5-turbo-0125", api_key=os.environ["OPENAI_API_KEY"])
dspy.settings.configure(lm=turbo)

You get this error:

Failed to run or to evaluate example [EXAMPLE] with <function validate_answer at 0xffff83f0c5e0> due to 'NoneType' object has no attribute 'kwargs'.

This led me down many hours of a wild goose chase until I realized I forgot to configure the LLM.

I'd be happy to look into how this specific Exception is handled and maybe suggest a different approach, but just wanted to note this here and get some reactions.

Thanks again!

kylerush avatar Apr 08 '24 14:04 kylerush

Thanks @kylerush ! Currently, there is a check for if the LM is configured predict which ensures the validation before any model generations.

But it's actually not getting hit since this error comes when you are using the dspy.BootstrapFewShot optimizer and the LM is being set for the teacher. When it tries to copy the kwargs, it triggers the error you got 'NoneType' object has no attribute 'kwargs'

I believe this can be simply resolved with the same check in predict, but we can add to the lm.copy method to make it a universal check, ensuring that an LM is loaded when a) copying the LM object or b) creating generations with it (already handled).

if not dsp.settings.lm:
            raise AssertionError("No LM is loaded.")

lmk if you had anything else in mind for a broader Exception handling before we push out a PR for this. thanks for the wild goose chase!!

arnavsinghvi11 avatar Apr 09 '24 19:04 arnavsinghvi11

Thanks for the thoughtful reply, @arnavsinghvi11! I think this solution could work. I'm also wondering if there is a larger opportunity to evaluate how exceptions are handled in DSPY. I'm generally wondering if the codebase is "swallowing" helpful exception information when it tries to be helpful by circumventing the default behavior of raising an exception (which comes with a full stack trace that is extremely helpful).

For example, here is a pattern that might be presenting a couple problems:

  1. This block has a "bare except" which is described as an anti-pattern here;
  2. It "swallows" the stack trace so you only get the error message, which makes it incredibly difficult to debug without diving into the source code;

I'm just writing this quickly as an example, but I think that perhaps a pattern like this could dramatically improve debugging errors:

except Exception as e:
    success = False
    with self.error_lock:
        self.error_count += 1
        current_error_count = self.error_count
    if current_error_count >= self.max_errors:
        raise e
    print(f"Failed to run or to evaluate example {example} with {self.metric} due to {e}.")
    raise e

The last line raises the exception, along with the helpful print context. Or this could be done as well:

except Exception as e:
    success = False
    with self.error_lock:
        self.error_count += 1
        current_error_count = self.error_count
    if current_error_count >= self.max_errors:
        raise e
    raise RuntimeError(f"Error in processing example {example} with metric {self.metric}.") from e

This approach gives both the helpful context and the original exception with the stack trace.

I haven't spent a ton of time looking at the code so I'm saying this without a lot of context on whether stopping execution of the program is desired — for me it would have been. Perhaps on a specific error it could be desired to stop the execution of the program.

My question to all of this is should we not modify the error handling instead of your proposed conditional check? I think I prefer letting exceptions be raised rather than trying to cleverly catch and explain the problem. The former helps a lot with debugging and the latter has gotten me into a lot of trouble by "swallowing" helpful information. I am open minded though and very curious about what you think.

Thank you for your time!

kylerush avatar Apr 09 '24 21:04 kylerush

Just coming back to this to re-emphasize my thought that a new pattern for error handling is probably needed. I'm running into a new error in my code and the output from DSPY is not enough to help me debug it. I have not figured out what the problem is yet.

Here is the output in DSPY:

Error for example in dev set:            list index out of range

No stack trace, no helpful debugging information other than the error message.

kylerush avatar Apr 11 '24 14:04 kylerush

Just want to note that I'd be happy to work on a pull request that proposes an approach to achieve easier debugging from errors inside of a DSPY program. Just thought we should chat about it first before I send a PR.

kylerush avatar Apr 11 '24 14:04 kylerush

Thanks @kylerush ! This would be great to add through a PR. The current behavior was designed to limit excessive outputted error messages, partly handled through how many errors can occur before completely halting execution but your proposed change would be useful to have that complete trace once halted.

arnavsinghvi11 avatar Apr 18 '24 17:04 arnavsinghvi11

Hey, I added a PR related to this as the lack of proper error logging has been the number 1 annoyance I have experienced when trying out DSPy for the past week or so. I just simply added the stacktrace to the printed error message, but if this seems to excessive we could parameterize this behaviour with some env variable for example

https://github.com/stanfordnlp/dspy/pull/1458

severi avatar Sep 06 '24 06:09 severi