StableToolBench
A new tool-learning benchmark that aims to balance stability and realism, based on ToolBench.
Thank you for your great work! I just ran the evaluation pipeline and checked the pass rates for ```toolllama v2```, ```gpt3.5-turbo```, and ```gpt4-turbo```. However, all the pass rates are significantly...
I think the server is built locally. Why do we need an extra key for usage permission?
Hi, thank you for your great work. I am trying to reproduce the experimental results in the paper. I used the ToolLLM v2 model with the CoT method for reasoning,...
Hi, I was really impressed by StableToolBench. I have submitted the form, but haven't received the key or any response at all. Hope to be as quickly as I can...
I have applied multiple times for the ToolBench key through the form provided on GitHub, but have not yet received a response. Could you kindly provide the key? My email...
I observed the following. Does it mean that the RapidAPI key on the server has expired? Observation: {"error": "", "response": "{'message': 'Invalid API key. Go to https://docs.rapidapi.com/docs/keys for more info.'}"}
Hi all, thank you very much for your work on StableToolBench. Do you provide the evaluation json files anywhere? I am referring to the json files containing the results of...