OpenHands: Make integration tests a little looser

@li-boxuan how bad an idea is this?

The idea is that instead of string-matching the prompts, we just iterate through the responses, one by one
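A minimal sketch of the two approaches, to make the difference concrete. This is not the project's actual test harness: `make_strict_mock`, `make_sequential_mock`, and the `completion(prompt)` shape are hypothetical names chosen for illustration.

```python
from typing import Callable, Dict, List


def make_strict_mock(recorded: Dict[str, str]) -> Callable[[str], str]:
    """String-matching mock: the prompt must equal a recorded prompt exactly."""
    def completion(prompt: str) -> str:
        if prompt not in recorded:
            # Any change to the prompt text, however small, lands here.
            raise KeyError("no recorded response for this prompt")
        return recorded[prompt]
    return completion


def make_sequential_mock(responses: List[str]) -> Callable[[str], str]:
    """Looser mock: ignore the prompt and replay the responses in order."""
    remaining = iter(responses)

    def completion(prompt: str) -> str:
        try:
            return next(remaining)
        except StopIteration:
            raise AssertionError("more LLM calls than recorded responses") from None
    return completion
```

With the sequential mock, rewording a prompt no longer breaks the test; only the number and order of LLM calls still matter.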

Apr 27 '24 12:04 rbren

> The idea is that instead of string-matching the prompts, we just iterate through the responses, one by one

I thought so at the beginning as well. The upside is clear - CI won't fail if one just modifies the prompt by a little bit.

The downsides are debatable though:

One can break the agent, say, by always sending nonsense to the LLM, and still have all these tests pass. A malicious attempt will most likely be caught by reviewers, but a genuine bug may not.
I feel like keeping the prompt files as test artifacts is a good way for developers and even users to learn what is being sent to the LLM.
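The first downside can be illustrated with a small sketch. Everything here is hypothetical (`run_agent`, `good_prompt`, `broken_prompt`, and the recorded responses are made up for the example, not taken from the repository): when the mock replays responses in order and never inspects the prompt, an agent that sends garbage produces exactly the same run as a correct one.

```python
from typing import Callable, List


def sequential_mock(responses: List[str]) -> Callable[[str], str]:
    """Replay recorded responses in order, ignoring the prompt entirely."""
    remaining = iter(responses)
    return lambda prompt: next(remaining)


def run_agent(llm, build_prompt, task: str, max_steps: int = 5) -> List[str]:
    """Hypothetical agent loop: query the LLM until it replies 'finish'."""
    history: List[str] = []
    for _ in range(max_steps):
        reply = llm(build_prompt(task, history))
        history.append(reply)
        if reply == "finish":
            break
    return history


def good_prompt(task: str, history: List[str]) -> str:
    return f"Task: {task}\nHistory: {history}"


def broken_prompt(task: str, history: List[str]) -> str:
    return "asdf"  # simulated bug: the task never reaches the LLM


recorded = ["run ls", "finish"]
good_run = run_agent(sequential_mock(recorded), good_prompt, "list files")
broken_run = run_agent(sequential_mock(recorded), broken_prompt, "list files")

# Both runs see identical responses, so a test on the outcome cannot tell them apart.
same_outcome = good_run == broken_run
```

A string-matching mock would have rejected `broken_prompt` on the first call, which is the safety net the looser test gives up.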

Apr 27 '24 17:04 li-boxuan

Going to close this--seems like there hasn't been as much integration test pain lately. But we should definitely work on a better way to generate the prompt files.

May 05 '24 01:05 rbren

That’s on my radar! Planning to work on that next week.

May 05 '24 01:05 li-boxuan