gptscript Implement first-class approach to testing GPTScripts

Open tylerslaton opened this issue 11 months ago • 1 comments

I recently wrote out a sample of what an integration suite could look like when written entirely in GPTScript. While we may not move forward with that specific approach, I'd like to propose a first-class way to do this, not just for our examples.

My proposal is a gptscript test <file>.gpt command that calls a built-in tool along the lines of sys.test. Testing then becomes a matter of calling sys.test on a structured .test.gpt file that specifies how to test the script and which test cases to run.
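To make the shape of such a structured test file more concrete, here is a minimal Python sketch of what it might describe. Everything in it is hypothetical and for illustration only: `TestCase`, `run_tests`, and the `run_script` callable are invented names, not part of GPTScript, and neither `gptscript test` nor `sys.test` exists as of this issue.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TestCase:
    """One hypothetical test case: an input and a substring the output must contain."""
    name: str
    script_input: str
    expect_contains: str


def run_tests(
    run_script: Callable[[str], str],
    cases: List[TestCase],
) -> List[Tuple[str, bool]]:
    """Run every case through run_script and return (name, passed) pairs."""
    results = []
    for case in cases:
        output = run_script(case.script_input)
        results.append((case.name, case.expect_contains in output))
    return results


# Stand-in for executing a .gpt file; a real runner would invoke GPTScript here.
def fake_script(script_input: str) -> str:
    return f"Hello, {script_input}!"


cases = [
    TestCase("greets by name", "Ada", "Hello, Ada"),
    TestCase("mentions weather", "Ada", "sunny"),
]
results = run_tests(fake_script, cases)
for name, passed in results:
    print(f"{'PASS' if passed else 'FAIL'}: {name}")
```

Passing the script runner in as a callable keeps the sketch independent of how a real implementation would execute the script.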

Open to ideas around how this UX could be improved, but I think that defining a standardized testing approach would be super beneficial.

Feb 29 '24 17:02 tylerslaton

Some of the trade-offs:

If we hand off testing to the LLM, results won't always be consistent: the same test can come out well on one run and worse on the next.
More structured testing removes some of the benefits that an LLM would provide.
I'd like to find some middle ground between the two with this approach.
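One possible middle ground, sketched below in Python under stated assumptions, is to keep each check deterministic but run the script several times and require a minimum pass rate, so occasional variation in the output does not fail the whole test. The names `pass_rate`, `passes`, and `flaky_script`, and the 0.8 threshold, are hypothetical choices for illustration and not anything GPTScript provides.

```python
import itertools
from typing import Callable


def pass_rate(
    run_script: Callable[[str], str],
    check: Callable[[str], bool],
    script_input: str,
    runs: int = 5,
) -> float:
    """Run the script `runs` times and return the fraction of runs passing `check`."""
    if runs <= 0:
        raise ValueError("runs must be positive")
    passed = sum(1 for _ in range(runs) if check(run_script(script_input)))
    return passed / runs


def passes(rate: float, threshold: float = 0.8) -> bool:
    """Treat the test as passing when the pass rate reaches the threshold."""
    return rate >= threshold


# Stand-in for a non-deterministic script: cycles through canned answers.
_outputs = itertools.cycle(["4", "four", "4", "4", "5"])


def flaky_script(_: str) -> str:
    return next(_outputs)


rate = pass_rate(
    flaky_script,
    lambda out: out.strip() in {"4", "four"},  # structured, deterministic check
    "What is 2 + 2?",
    runs=5,
)
print(rate, passes(rate))
```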

Feb 29 '24 18:02 tylerslaton

Our smoke tests do this to some extent now. While they don't exactly have "first-class gptscript support", they do use gpt-4o to perform fuzzy equality checks on the output and tool calls from executing a script.
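The fuzzy-equality idea can be sketched roughly as follows. This is not the smoke tests' actual code: the judge here is an offline stand-in built on Python's `difflib`, whereas the smoke tests ask a model to make the comparison. `Judge`, `similarity_judge`, and `fuzzy_equal` are hypothetical names, and the 0.8 threshold is an arbitrary assumption.

```python
from difflib import SequenceMatcher
from typing import Callable

# A judge decides whether an actual output is "close enough" to the expected one.
Judge = Callable[[str, str], bool]


def similarity_judge(threshold: float = 0.8) -> Judge:
    """Offline stand-in for a model-based judge, using character-level similarity."""
    def judge(expected: str, actual: str) -> bool:
        # Normalize case and whitespace before comparing.
        a = " ".join(expected.lower().split())
        b = " ".join(actual.lower().split())
        return SequenceMatcher(None, a, b).ratio() >= threshold
    return judge


def fuzzy_equal(expected: str, actual: str, judge: Judge) -> bool:
    """Exact matches pass immediately; otherwise defer to the judge."""
    return expected == actual or judge(expected, actual)


judge = similarity_judge()
close = fuzzy_equal(
    "The capital of France is Paris.",
    "the capital of  France is Paris",
    judge,
)
far = fuzzy_equal(
    "The capital of France is Paris.",
    "I could not find an answer.",
    judge,
)
print(close, far)
```

Keeping the judge behind a plain callable means a model-backed implementation could be swapped in without changing the comparison logic.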

I'm going to close this out, but I think we definitely can and should reopen if we feel the smoke tests don't meet our needs here.

Jul 09 '24 05:07 njhale
