Results 5 issues of tan90º

Hello, I encountered a runtime error when using the live demo on the Hugging Face Space. The error message is as follows:

## error message

### Runtime error

failed to create containerd...

![1712585243173](https://github.com/OSU-NLP-Group/SeeAct/assets/28804414/9689c185-4160-4296-87b6-ce8baa2e4e37)

I wanted to visualize how the model acts on the Mind2Web dataset, but SeeAct doesn't seem to support that. When computing online, the output "success_or_not" is always empty, which...

I've recently been using guidance to evaluate a dataset, but it can only handle one input at a time, which is really slow. I was wondering if support for...

**Describe the bug**

While reproducing mind2web (m2w) from AgentBench with gpt-3.5-turbo, I noticed that 35% of the results are `unknown`. In `runs.jsonl`, these 35% of results have no output at all.

At first I assumed the problem was on my side, but the score I reproduced is nearly identical to the one in the paper (20 in the original paper, 23 in my run), so this looks like an issue in AgentBench itself. I hope the authors can fix these `unknown` results.

**Screenshots or Terminal Copy&Paste**

![image](https://github.com/THUDM/AgentBench/assets/28804414/98e00fe5-4f45-4295-ab7f-20070c38f422)

![image](https://github.com/THUDM/AgentBench/assets/28804414/ebfc8432-4318-4d58-a1b5-1b72dd281cc3)

![image](https://github.com/THUDM/AgentBench/assets/28804414/d8b726f0-6cc5-49f4-834f-7b264099c8ba)

**Desktop (please complete the following information):**

- OS: windows11 + WSL2(Ubuntu) + Docker
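
For anyone who wants to check the same ratio on their own run, here is a minimal sketch that tallies result statuses in a JSONL file. The record layout is an assumption, not something confirmed by the report: the sketch guesses that each line is a JSON object whose status sits under `output.status` or a top-level `status` key, and the helper name `status_shares` is made up for illustration. Adjust the keys to match what your `runs.jsonl` actually contains.

```python
import json
from collections import Counter


def status_shares(lines, status_key="status"):
    """Return the fraction of records per status value.

    Assumption: each non-empty line is a JSON object, and the status is
    stored either under record["output"][status_key] or record[status_key].
    Records with neither are counted as "missing".
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        output = record.get("output")
        if not isinstance(output, dict):
            output = {}  # tolerate null or non-object outputs
        status = output.get(status_key) or record.get(status_key) or "missing"
        counts[status] += 1
    total = sum(counts.values())
    if total == 0:
        return {}
    return {status: n / total for status, n in counts.items()}
```

To use it on a file, open the file and pass the handle, for example `with open("runs.jsonl", encoding="utf-8") as f: print(status_shares(f))`. A large "missing" share would suggest the assumed keys do not match the real layout.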

bug
help wanted