[BFCL] Wrong Format in the Possible Answers of Live Parallel Multiple

Open tanliboy opened this issue 5 months ago • 0 comments

Describe the issue

There are several format issues in the possible answers of the live test cases.

ID datapoint

  1. Datapoint / Model Handler permalink: https://github.com/ShishirPatil/gorilla/blob/main/berkeley-function-call-leaderboard/data/possible_answer/BFCL_v2_live_parallel_multiple.json

     Here are failed examples from my model (IMHO, most of them are valid): sample.txt

What is the issue

  • Some live test cases incorrectly treat a string-typed parameter as an array of strings, which causes correct answers to fail.
  • There are inconsistencies in how lambda functions are written in the live tests. For instance, x**2 is replaced with x^2 in the possible answers, and the original lambda function strings are rejected.
  • Some test cases translate certain parameter values. For example, if a user inputs a location in Chinese, the expected answers only accept the translated (non-Chinese) version, causing mismatches.
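To make the three mismatches concrete, here is a minimal sketch of a more lenient value comparison. It is not the BFCL checker: `normalize` and `matches` are hypothetical helpers, and the sketch assumes each parameter in the possible-answer file maps to a list of accepted values.

```python
def normalize(value):
    """Normalize a parameter value for a lenient comparison (hypothetical helper)."""
    # Treat a one-element list of strings the same as the bare string.
    if isinstance(value, list) and len(value) == 1 and isinstance(value[0], str):
        value = value[0]
    if isinstance(value, str):
        # Treat the two power notations as equivalent; ignore spaces and case.
        return value.replace("^", "**").replace(" ", "").lower()
    return value


def matches(model_value, possible_answers):
    """Return True if model_value equals any accepted answer after normalization."""
    target = normalize(model_value)
    return any(normalize(answer) == target for answer in possible_answers)


# String vs. array of strings: the accepted value was stored as ["hello"].
print(matches("hello", [["hello"]]))

# Power notation: the model keeps x**2, the possible answer uses x^2.
print(matches("lambda x: x**2", ["lambda x: x^2"]))

# Translated parameter: the original-language value is listed as accepted too.
print(matches("北京", ["Beijing", "北京"]))
```

Rewriting `^` as `**` is only safe for fields known to hold math expressions, since `^` is XOR in Python. For the translation case the sketch does not translate anything; it just shows the original-language value being added to the accepted list.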

Proposed Changes

Correct the possible answers.

tanliboy Sep 04 '24 18:09