StableToolBench

A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.

17 StableToolBench issues

Thank you for your great work! I just ran the evaluation pipeline and checked the pass rates for `toolllama v2`, `gpt3.5-turbo`, and `gpt4-turbo`. However, all the pass rates are significantly...

I think the server is built locally. Why do we need an extra key for usage permission?

Hi, thank you for your great work. I am trying to reproduce the experimental results in the paper. I used the ToolLLM v2 model with the CoT method for reasoning,...

Hi, I was really impressed by StableToolBench. I have submitted the form but haven't received the key or any response at all. Hope to be as quickly as I can...

I have applied multiple times for the ToolBench key through the form provided on GitHub, but have not yet received a response. Could you kindly provide the key? My email...

I observed the following. Does it mean that the RapidAPI key on the server has expired? Observation: {"error": "", "response": "{'message': 'Invalid API key. Go to https://docs.rapidapi.com/docs/keys for more info.'}"}

Hi all, thank you very much for your work on StableToolBench. Do you provide the evaluation JSON files anywhere? I am referring to the JSON files containing the results of...