AgentBench issues

Revise Prompts to Comply with OpenAI API Policy

**Title**: Revise Prompts to Comply with OpenAI API Policy **Description**: ### Background Recent updates to the OpenAI API have introduced stricter content filtering policies, causing some of our existing prompts...

extreme1228

[Feature] Request for an update: the current data looks outdated.

As of February 2025, a large number of more capable LLMs have emerged.

d3vw

enhancement

[Feature] Adding Large Reasoning Models Results

Hi AgentBench Team, Thanks for your awesome effort in constructing this benchmark. I would like to ask whether you have added, or plan to add, the experimental results of large reasoning models...

Aaron617

enhancement

[Bug/Assistance] '{"detail":"Error: Task does not exist"}', 400, 'webshop-std'

3

I am trying to run webshop-std, but it reports that the task does not exist. May I ask why this happens? ![error](https://github.com/user-attachments/assets/648631af-541c-4c3a-a8b1-d461bd86191b) ![error 2](https://github.com/user-attachments/assets/6972df6b-10d0-4602-8b9d-1b538259c976) The following is my config:...

AlphaLee1113

bug

help wanted

[Bug/Assistance] Fix example code for an os task

In data/os_interaction/data/dev.json, the example code for task "Find out count of linux users on this system who belong to at least 4 groups." is incorrect. The current example checks for...

hannagabor

bug

help wanted

Danny majority

dannyslowpark

pull request

1

genglongling

[Bug/Assistance] The wrong entity pattern in task knowledgegraph

**Describe the bug** In the code, the following snippet is used to check whether the input string is an entity: ```python...

caixd-220529

bug

help wanted

[Bug/Assistance] View UI

I want to view the UI shown in the demo video. Does anyone know how I can do this?

auxiliary-kimchi

bug

help wanted

[Bug/Assistance] Docker build failed

1

Has anyone run into a 100 error during the Docker build? ``` docker build -f data/os_interaction/res/dockerfiles/default data/os_interaction/res/dockerfiles --tag local-os/default ``` ``` 1.987 At least one invalid signature was encountered. 2.082 Get:3...

YerongLi

bug

help wanted
