xland-minigrid
move baselines to separate repo and add MAML
Or Reptile as a simpler variation?
Hey! Are you looking to adapt a particular MAML implementation, or to reimplement it from the paper?
Hi @alexunderch! Personally, I am totally new to the gradient-based side of meta-RL and will definitely base the first iteration on some reference implementation to properly understand how it all works. I am planning to use the clean implementation from @EdanToledo and explore some of the variants described here.
Hey! Just a heads up, my implementation of MAML was really off the cuff. I haven't thoroughly tested it in any capacity. It was just me trying to do basic MAML in my free time after work.
That means it has an opportunity to be tested and improved.
@EdanToledo understandable! Unfortunately I have not found any other implementation in JAX + PPO. Do you have any references you used when you were figuring it out (besides the original paper)?
I think if there are not many reliable implementations, it doesn't sound too hard to test on gymnax-like envs. I am interested in trying MAML for MARL experiments too.
I just used the paper tbh, which is why it's very possible I messed up some aspects of it. I think there was a blog post that implemented it in JAX, but for supervised learning. I looked at it partially, but I don't remember it being that helpful.
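For anyone landing here looking for a starting point, below is a minimal sketch of the two updates mentioned in this thread, MAML and Reptile, in JAX. It is not @EdanToledo's implementation and not anything from xland-minigrid: it uses a toy supervised linear-regression problem instead of PPO, and every name in it (`loss_fn`, `inner_adapt`, `maml_step`, `reptile_step`, `sample_tasks`) and every hyperparameter value is an assumption made up for illustration.

```python
import jax
import jax.numpy as jnp

tree_map = jax.tree_util.tree_map


def loss_fn(params, x, y):
    # Mean squared error of a toy linear model y = w * x + b (illustrative only).
    pred = params["w"] * x + params["b"]
    return jnp.mean((pred - y) ** 2)


def inner_adapt(params, x, y, inner_lr, num_steps=1):
    # Plain gradient descent on one task's support set.
    for _ in range(num_steps):
        grads = jax.grad(loss_fn)(params, x, y)
        params = tree_map(lambda p, g: p - inner_lr * g, params, grads)
    return params


def maml_meta_loss(params, tasks, inner_lr):
    # Query-set loss after adaptation, averaged over the leading task axis.
    def task_loss(task):
        x_s, y_s, x_q, y_q = task
        return loss_fn(inner_adapt(params, x_s, y_s, inner_lr), x_q, y_q)

    return jnp.mean(jax.vmap(task_loss)(tasks))


def maml_step(params, tasks, inner_lr=0.1, outer_lr=0.1):
    # Differentiating through inner_adapt gives the second-order MAML gradient.
    loss, grads = jax.value_and_grad(maml_meta_loss)(params, tasks, inner_lr)
    return tree_map(lambda p, g: p - outer_lr * g, params, grads), loss


def reptile_step(params, tasks, inner_lr=0.1, outer_lr=0.5, num_steps=3):
    # First-order alternative: move the initialization toward the mean of the
    # task-adapted parameters, without differentiating through adaptation.
    def adapt(task):
        x_s, y_s, _, _ = task
        return inner_adapt(params, x_s, y_s, inner_lr, num_steps)

    adapted = jax.vmap(adapt)(tasks)  # every leaf gains a leading task axis
    return tree_map(
        lambda p, a: p + outer_lr * (jnp.mean(a, axis=0) - p), params, adapted
    )


def sample_tasks(key, num_tasks, num_points):
    # Hypothetical task distribution: random slope and offset per task.
    k_a, k_b, k_s, k_q = jax.random.split(key, 4)
    slope = jax.random.uniform(k_a, (num_tasks, 1), minval=0.5, maxval=1.5)
    offset = jax.random.uniform(k_b, (num_tasks, 1), minval=-1.0, maxval=1.0)
    x_s = jax.random.uniform(k_s, (num_tasks, num_points), minval=-1.0, maxval=1.0)
    x_q = jax.random.uniform(k_q, (num_tasks, num_points), minval=-1.0, maxval=1.0)
    return x_s, slope * x_s + offset, x_q, slope * x_q + offset


tasks = sample_tasks(jax.random.PRNGKey(0), num_tasks=8, num_points=10)
init_params = {"w": jnp.zeros(()), "b": jnp.zeros(())}

initial_loss = maml_meta_loss(init_params, tasks, 0.1)
maml_params = init_params
for _ in range(20):
    maml_params, _ = maml_step(maml_params, tasks)
final_loss = maml_meta_loss(maml_params, tasks, 0.1)

reptile_params = reptile_step(init_params, tasks)
```

The only structural difference between the two is the outer update: `maml_step` backpropagates through the inner gradient step, while `reptile_step` just interpolates toward the adapted parameters. For an RL version, `loss_fn` would presumably be replaced by a policy-gradient or PPO objective computed on rollouts, which this sketch does not attempt.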