Multi-Agent-Reinforcement-Learning-Environment

Under what circumstances will the env_FireFighter be done?

Open lurenyi233 opened this issue 5 years ago • 1 comments

Thank you for your great work. I'm new to RL and am learning how to build my own model.

For the other environments, I can see how the game finishes, for example when the agents put the box in the right location. But in the FireFighter environment, when is an episode done?

I have tried making fire level == [0,0,0,0] (or [2,2,2,2]) the goal: when it is reached, I end the episode and give a positive reward. I use a DQN to learn the strategy, but whatever actions I choose, the fire levels seem to increase, especially at the first and last houses. How can I set the stopping criterion in this environment? Do you have any ideas? Thank you!

lurenyi233 Aug 31 '19 14:08

This is a rewrite of the environment in "Exploiting Locality of Interaction in Factored Dec-POMDPs". I have not used it yet, so I am not 100% sure the code is correct (sorry). It is worth mentioning that 'done' only applies to episodic environments, and this problem should not be episodic. You could set a goal such as putting out all fires and see whether DQN can reach it. I am not sure DQN can solve this problem: it is partially observable and stochastic, so it is hard even though it looks simple.

Bigpig4396 Aug 31 '19 22:08
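
For anyone landing here later, a minimal sketch of the stopping rule discussed above: end the episode when all fires are out, or when a step cap is reached so training episodes always terminate. The function name `episode_status`, the bonus value, and the step cap are all hypothetical choices for illustration; none of them come from env_FireFighter itself, and the fire levels are assumed to be available as a plain list of integers.

```python
def episode_status(fire_levels, step_count, max_steps=100):
    """Return (done, bonus) under a hypothetical stopping rule.

    fire_levels: list of integer fire levels, one per house (assumed format).
    step_count:  number of environment steps taken so far in this episode.
    max_steps:   arbitrary cap, since the underlying task is not episodic.
    """
    if all(level == 0 for level in fire_levels):
        # Goal reached: every fire is out. The bonus size is an arbitrary choice.
        return True, 10.0
    if step_count >= max_steps:
        # Time limit hit: cut the episode off without any goal bonus.
        return True, 0.0
    return False, 0.0


# Usage: check the rule against a few hand-written states.
print(episode_status([0, 0, 0, 0], step_count=3))
print(episode_status([1, 0, 2, 0], step_count=3))
print(episode_status([2, 2, 2, 2], step_count=100))
```

Because the step cap is a truncation rather than a true terminal state, it is usually treated differently from the goal case when computing the DQN target (the target still bootstraps from the next state at a truncation).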