make integration test a little looser
@li-boxuan how bad an idea is this?
The idea is that instead of string-matching the prompts, we just iterate through the responses, one by one
I thought so at the beginning as well. The upside is clear - CI won't fail if one just modifies the prompt by a little bit.
The downsides are debatable though:
- One can break the agent, say, by always sending nonsense to the LLM, and still get all these tests to pass. A malicious attempt will most likely be caught by reviewers, but a genuine bug may not.
- I feel like keeping the prompt files as test artifacts is a good way for developers and even users to learn what is being sent to the LLM.
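To make the trade-off concrete, here is a minimal sketch of the two mocking strategies. The names `PromptMatchingLLM`, `SequentialLLM`, `RECORDED`, and `replay` are hypothetical and only illustrate the idea; this is not the actual test harness in this repo.

```python
# Hypothetical recorded (prompt, response) pairs; in a real setup these
# would be loaded from the prompt/response files kept as test artifacts.
RECORDED = [
    ("List the files", "ls"),
    ("Summarize the output", "done"),
]


class PromptMatchingLLM:
    """Strict mock: answers only prompts that exactly match a recorded one."""

    def __init__(self, recorded):
        self._by_prompt = dict(recorded)

    def completion(self, prompt):
        if prompt not in self._by_prompt:
            raise KeyError("no recorded response for this prompt")
        return self._by_prompt[prompt]


class SequentialLLM:
    """Loose mock: ignores the prompt and replays responses in order."""

    def __init__(self, responses):
        self._responses = iter(responses)

    def completion(self, prompt):
        # The prompt is never inspected, so any prompt "works".
        return next(self._responses)


def replay(llm, prompts):
    """Send each prompt to the mock and collect the responses."""
    return [llm.completion(p) for p in prompts]
```

With the loose mock, `replay(SequentialLLM(["ls", "done"]), ["garbage", "more garbage"])` still returns the recorded responses, which is exactly the first downside above: the agent could send nonsense and the test would stay green. The strict mock raises `KeyError` on the same input, but it also raises on any harmless rewording of a prompt.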
Going to close this, since there hasn't been as much integration test pain lately. But we should definitely work on a better way to generate the prompt files.
That’s on my radar! Planning to work on that next week.