Actor in WebShop does not take in the memory and reflexion

Open yananchen1989 opened this issue 1 year ago • 7 comments

hi there,

I am a bit confused about the reflexion for WebShop. In the code at line 245 of https://github.com/noahshinn/reflexion/blob/main/webshop_runs/webshop_trial.py, the LLM actor only takes in base_prompt + prompt, i.e. the trajectory of the current trial so far, which is the left yellow block in the figure. [image]

However, it seems that the LLM actor does not take in env_history, the right yellow block in the figure, which contains the memory and reflexion from previous trials (if trial > 1).
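To make the difference between the two prompts concrete, here is a minimal sketch. It is not the repo's code: `EnvHistorySketch`, the memory header text, and the sample instruction and reflection are all hypothetical stand-ins, chosen only to show that a prompt built from base_prompt plus the current steps never contains the reflections, while the string form of a history object that stores them does.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class EnvHistorySketch:
    """Hypothetical, simplified stand-in for the repo's env_history object."""
    base_prompt: str
    memory: List[str] = field(default_factory=list)  # reflections from earlier trials
    steps: List[str] = field(default_factory=list)   # actions/observations of this trial

    def add(self, line: str) -> None:
        self.steps.append(line)

    def __str__(self) -> str:
        # Render: base prompt, then the reflections (if any), then the current trajectory.
        parts = [self.base_prompt]
        if self.memory:
            parts.append("Your memory for the task below:")
            parts.extend(f"Trial {i}:\n{m}" for i, m in enumerate(self.memory))
        parts.extend(self.steps)
        return "\n".join(parts)


# Made-up task and reflection, for illustration only.
base_prompt = "Webshop\nInstruction: find a red mug"
history = EnvHistorySketch(
    base_prompt,
    memory=["I clicked Buy Now too early; check the colour option first."],
)
history.add("Action: search[red mug]")
history.add("Observation: [B01] Red Mug $9.99")

# What the actor sees if only the base prompt and current steps are concatenated.
prompt_without_memory = base_prompt + "\n" + "\n".join(history.steps)
# What the actor sees if the whole history object is rendered.
prompt_with_memory = str(history)

print("reflection in first prompt: ", "check the colour" in prompt_without_memory)
print("reflection in second prompt:", "check the colour" in prompt_with_memory)
```

Under these assumptions, an actor that is only ever given the first prompt behaves the same on every trial, no matter what was reflected on earlier.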

May I know if I am missing something?

If the LLM actor indeed does not take in the memory, could that explain why there is no gain on the WebShop task, as you reported in the paper? Thanks.

yananchen1989 avatar Feb 01 '24 19:02 yananchen1989

I forked your repo and made changes here: https://github.com/noahshinn/reflexion/compare/main...yananchen1989:reflexion:yc#diff-36a02556b49e22008fa36a519bf0cde61f8343559dfde60a3c229fb72176d00fR304

I am not sure it should be like that. Please advise.

yananchen1989 avatar Feb 02 '24 23:02 yananchen1989

[image] My test results, FYI.

yananchen1989 avatar Feb 05 '24 21:02 yananchen1989

@yananchen1989 Can I ask which columns belong to your changes? The "with reflex" ones?

theblackcat102 avatar Apr 13 '24 04:04 theblackcat102

Hello @theblackcat102. The change is on line 304:

action = llm_chat(str(env_history) + "\n\nAction:", stop=['\n']).strip().lstrip(' ') # fix the reflexion
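One way to check that a change like this really gets the reflections to the model is to swap the LLM call for a stub that records what it receives. The sketch below is only an illustration: `llm_chat_stub`, its return value, and the history text are hypothetical, and the real llm_chat in the repo calls an actual model.

```python
from typing import List, Optional

# Every prompt the stub receives is stored here for inspection.
received_prompts: List[str] = []


def llm_chat_stub(prompt: str, stop: Optional[List[str]] = None) -> str:
    """Hypothetical stand-in for llm_chat: records the prompt, returns a fixed action."""
    received_prompts.append(prompt)
    return " search[red mug]\n"


# Made-up rendering of a history that already holds one reflection.
env_history_text = (
    "Instruction: find a red mug\n"
    "Your memory for the task below:\n"
    "Trial 0:\nI bought the wrong colour; open the options first.\n"
    "Observation: search page"
)

# Same call shape as the changed line, with the stub in place of the real model.
action = llm_chat_stub(env_history_text + "\n\nAction:", stop=["\n"]).strip()

print(action)
print("reflection reached the model:", "Trial 0" in received_prompts[0])
```

If the recorded prompt contains the memory section, the actor is conditioned on earlier trials; if it does not, the reflections are generated but never used.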

yananchen1989 avatar Apr 13 '24 04:04 yananchen1989

I guess this could be a bug in the original code that leads to the wrong conclusion regarding WebShop. Correct me if I am missing something.

yananchen1989 avatar Apr 13 '24 04:04 yananchen1989

@yananchen1989 Hi, may I know which model this result comes from?

DZ9 avatar Sep 04 '24 02:09 DZ9

Wow! I missed this result, but thank you for finding this issue!

noahshinn avatar Sep 04 '24 04:09 noahshinn