parsel
Successful reproduction of the APPS experiments using only GPT-3.5
Since OpenAI deprecated Codex, I tried to reproduce the APPS experiments from the Parsel paper using only GPT-3.5. Thanks to the code in the saycan branch, I was able to fully understand your evaluation method. After a lot of work modifying the prompts and Parsel itself, I reproduced part of the experiments described in Section 3.1 of the paper and even got better results: the GPT-3.5-only version of Parsel (8x16) solved 27 of 100 randomly sampled competition-level problems in APPS. I am sharing the modified code in case it is useful to others.
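For anyone trying to get a comparable number, here is a minimal sketch of how a "solved k of 100" figure could be tallied. It assumes a problem counts as solved when at least one generated candidate program passes every test case for that problem. The function names and the data layout (`results_by_problem`) are hypothetical and are not taken from Parsel or from the modified code.

```python
import random


def sample_problems(problem_ids, k=100, seed=0):
    """Draw k problem IDs without replacement, reproducibly for a fixed seed."""
    rng = random.Random(seed)
    return rng.sample(sorted(problem_ids), k)


def is_solved(candidate_results):
    """Return True if any candidate passed all of its test cases.

    candidate_results: one inner list per candidate program,
    holding one bool per test case (hypothetical layout).
    A candidate with no test results is not counted as passing.
    """
    return any(len(tests) > 0 and all(tests) for tests in candidate_results)


def solve_rate(results_by_problem):
    """Return (number of solved problems, total number of problems)."""
    solved = sum(is_solved(r) for r in results_by_problem.values())
    return solved, len(results_by_problem)
```

Fixing the seed in `sample_problems` keeps the sampled subset identical across runs, which makes results from different prompt or model changes easier to compare.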
Hey Yutong, could you share the modifications related to the evaluation as well? I'm trying to reproduce the APPS results (27/100) from your post.
Sorry, it has been a long time and I have forgotten many of the evaluation details. See https://github.com/wyt2000/Automatic-ANPL/tree/apps for help.