
move baselines to separate repo and add MAML

Open Howuhh opened this issue 11 months ago • 7 comments

Or Reptile as a simpler variation?

Howuhh avatar Mar 23 '24 13:03 Howuhh
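For readers unfamiliar with why Reptile is usually considered the simpler option: it is a first-order method, so unlike MAML it does not differentiate through the inner adaptation steps. It adapts to a sampled task with a few plain SGD steps, then moves the meta-parameters a fraction of the way toward the adapted ones. Below is a minimal sketch in plain Python on a toy 1D regression problem. It is not taken from this repository or from any implementation mentioned in the thread; the function names (`inner_sgd`, `reptile`), the task family (`y = slope * x`), and all hyperparameters are assumptions chosen only for illustration.

```python
import random


def inner_sgd(w, slope, steps=5, lr=0.1, rng=None):
    """Adapt w to one toy task (fit y = slope * x with the model y = w * x).

    Hypothetical helper: runs a few plain SGD steps on single random samples.
    """
    rng = rng or random
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)
        grad = 2.0 * (w * x - slope * x) * x  # d/dw of the squared error
        w -= lr * grad
    return w


def reptile(task_slopes, meta_steps=2000, meta_lr=0.1, seed=0):
    """Reptile outer loop on the toy task family (illustrative settings only)."""
    rng = random.Random(seed)
    w = 0.0  # meta-parameter (a single scalar weight here)
    for _ in range(meta_steps):
        slope = rng.choice(task_slopes)          # sample a task
        w_adapted = inner_sgd(w, slope, rng=rng)  # inner-loop adaptation
        # Reptile update: interpolate toward the adapted parameters.
        # No gradient flows through the inner loop.
        w += meta_lr * (w_adapted - w)
    return w


meta_w = reptile([1.0, 2.0, 3.0])
print(meta_w)
```

In a JAX + PPO setting the scalar `w` would be a parameter pytree and the inner loop would be a few policy-gradient updates, but the outer interpolation step would keep the same shape.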

Hey! Do you plan to adapt a particular MAML implementation, or reimplement it from the paper?

alexunderch avatar Mar 23 '24 14:03 alexunderch

Hi @alexunderch! Personally, I am totally new to the gradient-based side of meta-RL, so I will definitely base the first iteration on a reference implementation to properly understand how it all works. I am planning to use the clean implementation from @EdanToledo and explore some of the variants described here.

Howuhh avatar Mar 23 '24 14:03 Howuhh

Hey! Just a heads up, my implementation of MAML was really off the cuff. I haven't thoroughly tested it in any capacity. It was just me trying to do basic MAML in my free time after work.

EdanToledo avatar Mar 23 '24 20:03 EdanToledo

That means it has a chance to be tested and improved.

alexunderch avatar Mar 23 '24 20:03 alexunderch

@EdanToledo understandable! Unfortunately, I have not found any other JAX + PPO implementation. Do you have any references you used when you were figuring it out (besides the original paper)?

Howuhh avatar Mar 23 '24 20:03 Howuhh

Even if there are not many reliable implementations, I think testing one on gymnax-like envs doesn't sound too hard. I am interested in trying MAML for MARL experiments too.

alexunderch avatar Mar 23 '24 21:03 alexunderch

> @EdanToledo understandable! Unfortunately, I have not found any other JAX + PPO implementation. Do you have any references you used when you were figuring it out (besides the original paper)?

I just used the paper tbh, which is why it's quite possible I messed up some aspects of it. I think there was a blog post that implemented it in JAX, but for supervised learning. I partially looked at it, but I don't remember it being that helpful.

EdanToledo avatar Mar 23 '24 21:03 EdanToledo