AI-Functions
Enhanced AI Functions: Introducing a New Flavor of AI Functions for Improved Results
Overview
This PR introduces an enhanced approach to generating AI function results, referred to here as code-execution AI functions. These functions prompt the GPT model to generate code for the desired function and then execute that code with the specified arguments. To run the GPT-provided code, we use a custom exec_with_return function that returns the result, unlike the commonly used exec function, which does not provide a return value.
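As a rough illustration of the idea, here is a minimal sketch of what a function like exec_with_return could look like. It is not necessarily the implementation in this PR: the signature, the `namespace` parameter, and the rule "return the value of the final expression" are assumptions made for the example.

```python
import ast


def exec_with_return(code, namespace=None):
    """Run `code` and return the value of its last expression, or None.

    Illustrative sketch only; the signature is an assumption.
    """
    namespace = {} if namespace is None else namespace
    tree = ast.parse(code, mode="exec")

    # If the final statement is a bare expression, evaluate it separately
    # so its value can be handed back to the caller.
    if tree.body and isinstance(tree.body[-1], ast.Expr):
        last = ast.Expression(tree.body.pop().value)
        exec(compile(tree, "<generated>", "exec"), namespace)
        return eval(compile(last, "<generated>", "eval"), namespace)

    # No trailing expression: behave like plain exec.
    exec(compile(tree, "<generated>", "exec"), namespace)
    return None


# Hypothetical model output followed by the call with the requested arguments.
generated = "def triangle_area(base, height):\n    return 0.5 * base * height\n"
result = exec_with_return(generated + "triangle_area(3, 4)")
```

Running model-generated code this way executes arbitrary Python, so a real deployment would likely want some form of sandboxing.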
The new approach achieves superior outcomes:
| Description | GPT-4 Result | GPT-3.5-turbo Result | Comment |
|---|---|---|---|
| Generate fake people | PASSED | PASSED | N/A |
| Generate Random Password | PASSED | PASSED | N/A |
| Calculate area of triangle | PASSED | PASSED | N/A |
| Calculate the nth prime number | PASSED | PASSED | N/A |
| Encrypt text | PASSED | FAILED | GPT-3.5-turbo fails to generate code that compiles. |
| Find missing numbers | PASSED | PASSED | N/A |
The only failure occurs with Encrypt text under GPT-3.5-turbo.
Further Improvements
One significant advantage of code-execution AI functions is their ability to detect GPT failures. Consequently, we can employ two strategies to overcome these failures:
- attempting error correction for the faulty code
- falling back to code-less AI functions
Attempting Error Correction
If the generated code cannot be compiled, the error message can be fed back into the GPT model, prompting it to refine the code. For Encrypt text with GPT-3.5-turbo, however, error correction was not enough: the refined code still could not be executed.
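The feedback loop described above could be sketched roughly as follows. The `ask_model` callable, the wording of the correction prompt, and the `max_attempts` limit are all hypothetical and stand in for whatever the PR actually uses to call the GPT model.

```python
from typing import Callable, Optional


def generate_with_correction(
    ask_model: Callable[[str], str],  # hypothetical: prompt in, source code out
    prompt: str,
    max_attempts: int = 3,  # assumed retry limit
) -> Optional[str]:
    """Return generated source that compiles, or None if every attempt fails."""
    current_prompt = prompt
    for _ in range(max_attempts):
        code = ask_model(current_prompt)
        try:
            # Only checks that the code compiles; it is not executed here.
            compile(code, "<generated>", "exec")
            return code
        except SyntaxError as err:
            # Feed the failing code and the error message back to the model.
            current_prompt = (
                f"{prompt}\n\nYour previous code failed to compile:\n"
                f"{code}\n\nError: {err}\nReturn a corrected version."
            )
    return None
```

Returning None when every attempt fails gives the caller a clear signal that another strategy is needed.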
Fallback to code-less AI functions
Fortunately, the original approach, termed code-less AI functions, can serve as a fallback when code-execution AI functions fail. For instance, code-less AI functions can produce a result for the Encrypt text test when using GPT-3.5-turbo. I have therefore integrated code-less AI functions as a fallback for cases where code-execution AI functions are unsuccessful. With this combined approach, AI functions pass all tests:
| Description | GPT-4 Result | GPT-3.5-turbo Result | Comment |
|---|---|---|---|
| Generate fake people | PASSED | PASSED | N/A |
| Generate Random Password | PASSED | PASSED | N/A |
| Calculate area of triangle | PASSED | PASSED | N/A |
| Calculate the nth prime number | PASSED | PASSED | N/A |
| Encrypt text | PASSED | PASSED | N/A |
| Find missing numbers | PASSED | PASSED | N/A |
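The fallback behind these results can be pictured with a small sketch. Both callables are hypothetical placeholders for the two flavors of AI function, and catching every `Exception` is an assumption made for brevity rather than a description of the PR's code.

```python
from typing import Any, Callable


def ai_function_with_fallback(
    code_execution: Callable[[], Any],  # hypothetical: generate code and run it
    code_less: Callable[[], Any],       # hypothetical: ask the model for the result directly
) -> Any:
    """Prefer the code-execution result; use the code-less one on failure."""
    try:
        return code_execution()
    except Exception:
        # Any compile-time or run-time failure triggers the fallback.
        return code_less()
```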
Conclusion
Code-execution AI functions deliver superior results for two reasons:
- they provide more accurate results
- they can detect failures, allowing for the introduction of fallback and error correction mechanisms
Combining code-execution AI functions with code-less AI functions yields the best results at this stage. Additional tests would help to further evaluate the capabilities and limitations of AI functions.