rl-agents
Gridworld scenario running issues
Sorry, I find that MCTSAgent fails to run on the gridworld test environments, including DummyEnv/gridenv.json (this seems to be a toy gridworld you implemented yourself) and Gridworld/empty.json (this one is imported from gym_minigrid). I am doing a research project on MCTS these days and really want to test its performance on gridworld environments. Your open-source code has provided a wonderful platform for testing tree-search algorithms. Could you please help fix the gridworld testing issues? Thanks.
Hi @hebowei2000, I do not think there is anything wrong with the implemented algorithms (but I may be mistaken). I think you'll find that, perhaps surprisingly, MCTS algorithms are not very well suited to gridworld-like environments, especially ones with sparse rewards. We give a possible explanation in this paper: basically, in the absence of rewards to guide exploration, the planning algorithms can only sample action sequences uniformly. Yet, uniform exploration in the (tree-like) space of action sequences is NOT uniform exploration in the (graph-like) state space. Much like in a random walk process, the sampled states will tend to concentrate around the initial state. Indeed, many actions cancel each other out, so undirected exploration will sample trajectories such as left-right-left-left-right-left, which visit the same states back and forth.
There may also be other reasons, but it seems to me that seemingly simple gridworld problems (say 20x20 grids) are already very challenging for tree-based planners.
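To illustrate the concentration effect described above, here is a small self-contained sketch (not from the rl-agents codebase; the grid size, horizon, and helper name are all hypothetical choices for illustration). It rolls out uniformly random action sequences on a grid and measures how far from the start state they typically end up. A directed policy could travel `horizon` cells in `horizon` steps, but the random walk stays within a small neighborhood of the initial state:

```python
import random

def mean_final_distance(size=101, horizon=50, episodes=2000, seed=0):
    """Roll out uniformly random action sequences on a size x size grid,
    starting from the center, and return the average Chebyshev distance
    from the start after `horizon` steps."""
    rng = random.Random(seed)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # left, right, up, down
    start = size // 2
    total = 0.0
    for _ in range(episodes):
        x = y = start
        for _ in range(horizon):
            dx, dy = rng.choice(moves)
            # clamp to the grid boundaries, as in a bounded gridworld
            x = min(max(x + dx, 0), size - 1)
            y = min(max(y + dy, 0), size - 1)
        total += max(abs(x - start), abs(y - start))
    return total / episodes

if __name__ == "__main__":
    d = mean_final_distance()
    print(f"mean distance from start after 50 uniform steps: {d:.1f}"
          f" (a directed policy could reach distance 50)")
```

On a grid large enough that the walls are never hit in 50 steps, the mean distance comes out around 5 cells rather than 50: uniform sampling over action sequences concentrates probability mass on states near the root, which is exactly why sparse distant rewards are so hard for undirected tree search to find.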
Thanks. I have fixed the related bugs I encountered when trying to run MCTS on the gridworld environment. I think what you mentioned above gives me some inspiration. Thanks all the same.