StableToolBench
A new tool learning benchmark aiming at well-balanced stability and reality, based on ToolBench.
Hi, excellent work! I have submitted the form, but I haven't received the key or any response at all. I hope to get the key as quickly as possible, to...
Your work has greatly impressed me. I would like to express my sincere thanks for your contribution to our research community. I intend to reproduce your excellent work for...
Hi! I'm working on running ToolLLaMa against the StableToolBench server, and noticed an issue. I am executing the following:
```bash
python toolbench/inference/qa_pipeline.py \
    --tool_root_dir data_example/toolenv/tools/ \
    --backbone_model toolllama \
    --model_path...
```
I'm testing the pass rate evaluation. Could you provide the reproduction data, as ToolBench does? Thanks for your reply.
What is causing the error below? `Traceback (most recent call last): File "/home/guan/shared/StableToolBench/toolbench/tooleval/eval_pass_rate.py", line 100, in example, File "/root/anaconda3/envs/StableToolBench/lib/python3.9/concurrent/futures/_base.py", line 433, in result return self.__get_result() File "/root/anaconda3/envs/StableToolBench/lib/python3.9/concurrent/futures/_base.py", line 389, in __get_result raise self._exception File "/root/anaconda3/envs/StableToolBench/lib/python3.9/concurrent/futures/thread.py",...
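The visible frames of this traceback (`_base.py` in `result`, then `raise self._exception`) are what Python shows when a task submitted to a `concurrent.futures` executor raised inside its worker thread and the exception was re-raised at `future.result()`. The root cause is therefore in the truncated part of the trace, not in these frames. The sketch below is a generic illustration of that behaviour, not the code in `eval_pass_rate.py`; the `evaluate` worker and `run_all` helper are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def evaluate(example_id):
    # Hypothetical worker: fails for one input to mimic an evaluation error.
    if example_id == 2:
        raise ValueError(f"evaluation failed for example {example_id}")
    return example_id * 10


def run_all(example_ids):
    """Run every example in a thread pool, collecting results and errors separately."""
    results, errors = {}, {}
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(evaluate, i): i for i in example_ids}
        for future in as_completed(futures):
            example_id = futures[future]
            try:
                # result() re-raises whatever the worker thread raised.
                results[example_id] = future.result()
            except Exception as exc:
                errors[example_id] = exc
    return results, errors


results, errors = run_all([1, 2, 3])
```

Catching the exception per future, as above, keeps one failing example from aborting the whole run and makes the underlying error easy to log.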
Thanks for your work! When I tried to reproduce the inference, I encountered an issue. Is this normal? `openai.BadRequestError: Error code: 400 - {'error': {'message': "None is not of type...
Great work on tool use! However, I have some questions about the results. I would be grateful for a reply.
### 1. In `StableToolBench` I find pass rate result...
Thank you for your contribution. I noticed you mentioned that "We currently implement all models and algorithms supported by ToolBench." Could you clarify if this includes the DFSDT and ReAct...
Hi, I'm wondering why this benchmark doesn't include results for native LLMs (such as Llama 2 and Llama 3). Do you plan to add these results to this work?
import requests
import json

url = 'http://0.0.0.0:8005/virtual'
# data = {
#     "category": "Media",
#     "tool_name": "newapi_for_media",
#     "api_name": "url",
#     "tool_input": {'url': 'https://api.socialmedia.com/friend/photos'},
#     "strip": "",
#     "toolbench_key": "XXXXXXX"...
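For readers who want to try a similar request, here is a minimal sketch of posting such a payload to the `/virtual` endpoint. The field names and the URL come from the snippet above. Everything else is an assumption: that the server accepts a JSON body over POST, that it replies with JSON, and the helper names `build_payload`, `build_request`, and `call_virtual_api`, which are hypothetical. It uses the standard library's `urllib` in place of `requests` so that it runs without extra dependencies.

```python
import json
import urllib.request

VIRTUAL_API_URL = "http://0.0.0.0:8005/virtual"  # URL taken from the snippet


def build_payload(category, tool_name, api_name, tool_input, toolbench_key, strip=""):
    """Assemble the request body using the field names visible in the snippet."""
    return {
        "category": category,
        "tool_name": tool_name,
        "api_name": api_name,
        "tool_input": tool_input,
        "strip": strip,
        "toolbench_key": toolbench_key,
    }


def build_request(url, payload):
    """Create a POST request with a JSON body (assumed to be what the server expects)."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call_virtual_api(url, payload, timeout=30):
    """Send the request and decode the reply, assuming the server returns JSON."""
    with urllib.request.urlopen(build_request(url, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


payload = build_payload(
    category="Media",
    tool_name="newapi_for_media",
    api_name="url",
    tool_input={"url": "https://api.socialmedia.com/friend/photos"},
    toolbench_key="XXXXXXX",  # placeholder, as in the snippet
)
request = build_request(VIRTUAL_API_URL, payload)
# call_virtual_api(VIRTUAL_API_URL, payload)  # needs a running server, so left commented out
```

Separating request construction from sending makes it possible to inspect the exact body before contacting the server.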