tan90º issues

Results: 5 issues of tan90º

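The discussion group has reached its 200-member limit and I can't join. Could you share the contact details of someone in charge who can add me to the group?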

Runtime error when using live demo

Hello, I encountered a runtime error when using the live demo on HuggingFace Space. The error message is as follows: ## error message ### Runtime error failed to create containerd...

Can an online assessment yield a score, or can the process of an offline assessment be visualized?

![1712585243173](https://github.com/OSU-NLP-Group/SeeAct/assets/28804414/9689c185-4160-4296-87b6-ce8baa2e4e37) I wanted to visualize how the model acts on the Mind2Web dataset, but SeeAct doesn't seem to support that. When running online, the output "success_or_not" is always empty, which...

Can guidance support batch input?

I've been using guidance to evaluate a dataset recently, but guidance can only handle one input at a time, which is really slow, and I was wondering if support for...

[Bug/Assistance] What is going on with the unknown results in mind2web?

**Describe the bug** While reproducing mind2web (m2w) from AgentBench with gpt-3.5-turbo, I noticed that 35% of the results are unknown, and in `runs.jsonl` these unknown results have no output at all. I first assumed the problem was on my side, but the score I reproduced is nearly identical to the one in the paper (20 in the original paper, 23 in my run), so this looks like an issue in AgentBench itself. I hope the authors can fix this unknown result. **Screenshots or Terminal Copy&Paste** ![image](https://github.com/THUDM/AgentBench/assets/28804414/98e00fe5-4f45-4295-ab7f-20070c38f422) ![image](https://github.com/THUDM/AgentBench/assets/28804414/ebfc8432-4318-4d58-a1b5-1b72dd281cc3) ![image](https://github.com/THUDM/AgentBench/assets/28804414/d8b726f0-6cc5-49f4-834f-7b264099c8ba) **Desktop (please complete the following information):** - OS: windows11 + WSL2(Ubuntu) + Docker

bug

help wanted