Ryan H. Tran
The evaluation for the MATH subset can now be run using `bash ./evaluation/mint/run_infer.sh`. The result looks like this: ``` { "id": 0, "instance": { "task_name": "reasoning", "task_id": 0, "prompt": "What is...
I tested locally with the first two examples, and both pass now. Can you try again? @yufansong
I'm not sure if I'm understanding correctly, but this implementation seems to be a bit different from the original `ProntoQA`. Here we feed the program in advance to the agent...
> I have not read the original paper. Could you tell me the difference between `original ProntoQA` and this implementation? From my understanding, the original implementation gives the model the...
> You are definitely right. The raw ProntoQA dataset does not provide any symbolic language expression or corresponding programs. The logic of a neuro-symbolic method is to 1) parse logic...
Not sure if I'm missing anything, but when I ran the evaluation, there were a bunch of `permission denied` logs and the agent seemed to fail since the log file...
> > Not sure if I'm missing anything, but when I ran the evaluation, there were a bunch of `permission denied` logs and the agent seemed to fail since the...
Hi @LakshyAAAgrawal, thanks for offering your help! I've tried to integrate `multilspy` into the [openhands-aci](https://github.com/All-Hands-AI/openhands-aci/) before (at [this branch](https://github.com/All-Hands-AI/openhands-aci/tree/ht/integrate-lsp)) but didn't finish it. Would be great if you can have...
> This may or may not have some licensing restrictions. Since ACR is under GPL-3.0

Oh yeah, I didn't notice this. Maybe it's not desirable to use the original code...
> Instead of ACR, how about implementing 'mentat auto context'.

This issue aims to provide the agent with tools more tailored for coding tasks, improving upon the available search tools...