
Failed to load the 80% winrate model

Open yhyu13 opened this issue 7 years ago • 6 comments

Hi,

I installed ELF according to the install script and downloaded the trained model. After that, I tried to run ./eval_minirts.sh ./rts/game_MC/model/model-winrate-80.0-357800.bin 50, where the first argument is the path to the model and the second is the number of frames skipped per action (50, as the README suggests). It throws the following error:

Load from ./rts/game_MC/model/model-winrate-80.0-357800.bin
/home/ubuntu/miniconda3/envs/elf/lib/python3.6/site-packages/torch/serialization.py:286: SourceChangeWarning: source code of class 'model.Model_ActorCritic' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
Traceback (most recent call last):
  File "eval.py", line 25, in <module>
    model = env["model_loaders"][0].load_model(GC.params)
  File "/home/ubuntu/ELF/rlpytorch/model_loader.py", line 82, in load_model
    model.load(self.load, omit_keys=omit_keys)
  File "/home/ubuntu/ELF/rlpytorch/model_base.py", line 107, in load
    self.load_state_dict(data["stats_dict"])
  File "/home/ubuntu/miniconda3/envs/elf/lib/python3.6/site-packages/torch/nn/modules/module.py", line 369, in load_state_dict
    raise KeyError('missing keys in state_dict: "{}"'.format(missing))
KeyError: 'missing keys in state_dict: "{\'Wt3.bias\', \'Wt.weight\', \'Wt2.weight\', \'Wt3.weight\', \'Wt.bias\', \'Wt2.bias\'}"'

I am guessing that the trained model is a little different from the current actor-critic model definition, right?
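
For anyone else hitting this, a quick way to check what the checkpoint actually contains is to print its keys. A minimal diagnostic sketch, assuming the weights really are stored under "stats_dict" as the traceback suggests (not part of ELF itself):

    # Print the parameter names stored in the checkpoint. Run it with the same
    # working directory / PYTHONPATH as eval.py so the pickled
    # model.Model_ActorCritic class can be found during unpickling.
    import torch

    data = torch.load("./rts/game_MC/model/model-winrate-80.0-357800.bin",
                      map_location=lambda storage, loc: storage)
    for key in sorted(data["stats_dict"].keys()):
        print(key)
    # If no Wt*.weight / Wt*.bias entries show up, the checkpoint simply does
    # not contain those layers.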

yhyu13 avatar Dec 29 '17 02:12 yhyu13

I have the same problem. Did you solve it? @yhyu13

zhanghongjie101 avatar Dec 30 '17 06:12 zhanghongjie101

OK, you need an additional option to load the model:

--omit_keys Wt,Wt1,Wt2,Wt3

Please try that first; if that doesn't work, let me know and I will try to fix it.
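
If it still fails, it may help to compare the parameter names the current Model_ActorCritic instance defines against the keys in the checkpoint. A generic PyTorch sketch (the model argument stands for whatever instance the loader builds; it is not constructed here):

    # List the parameter names a module defines. `model` is assumed to be an
    # already-built Model_ActorCritic (or any torch.nn.Module); nothing
    # ELF-specific happens here.
    def print_param_names(model):
        for name, param in model.named_parameters():
            print(name, tuple(param.size()))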

yuandong-tian avatar Dec 30 '17 22:12 yuandong-tian

@yuandong-tian @zhanghongjie101

I modified the commented-out flag in eval_minirts.sh (#--omit_keys Wt,Wt2,Wt3) to match yours, but it still fails to load the model:

Load from ./rts/game_MC/model/model-winrate-80.0-357800.bin
Omit_keys = ['Wt', 'Wt1', 'Wt2', 'Wt3']
/home/ubuntu/miniconda3/envs/elf/lib/python3.6/site-packages/torch/serialization.py:286: SourceChangeWarning: source code of class 'model.Model_ActorCritic' has changed. you can retrieve the original source code by accessing the object's source attribute or set `torch.nn.Module.dump_patches = True` and use the patch tool to revert the changes.
  warnings.warn(msg, SourceChangeWarning)
Traceback (most recent call last):
  File "eval.py", line 25, in <module>
    model = env["model_loaders"][0].load_model(GC.params)
  File "/home/ubuntu/ELF/rlpytorch/model_loader.py", line 82, in load_model
    model.load(self.load, omit_keys=omit_keys)
  File "/home/ubuntu/ELF/rlpytorch/model_base.py", line 104, in load
    del data["stats_dict"][k + ".weight"]
KeyError: 'Wt.weight'

I am not familiar with displaying network parameters in PyTorch, so I will leave it to you guys.


The previous exception says: KeyError: 'missing keys in state_dict: "{\'Wt3.bias\', \'Wt.weight\', \'Wt2.weight\', \'Wt3.weight\', \'Wt.bias\', \'Wt2.bias\'}"'

So I guess the flag should be --omit_keys Wt,Wt2,Wt3. However, it still gives me KeyError: 'Wt.weight'.


I tried --omit_keys Wt2,Wt3, and it gives me KeyError: 'Wt2.weight'. The same thing happens when I try --omit_keys Wt3, and it gives me KeyError: 'Wt3.weight'.

I guess either the script that loads weights isn't right, or the saved model is corrupted.
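
For what it's worth, judging only from the traceback, the load path seems to delete "<key>.weight" / "<key>.bias" entries for every name passed via --omit_keys before calling load_state_dict, which would raise exactly this KeyError whenever the checkpoint never contained that key in the first place. A rough reconstruction (an assumption based on the traceback, not the actual rlpytorch/model_base.py code), with a more tolerant variant:

    # Sketch of how omit_keys appears to be handled, plus a variant that skips
    # keys the checkpoint does not contain.
    def strip_omitted_keys(stats_dict, omit_keys):
        for k in omit_keys:
            # The original seems to do an unconditional delete, e.g.:
            #     del stats_dict[k + ".weight"]
            # which raises KeyError: 'Wt.weight' when that entry is absent.
            # pop(..., None) removes the entry only if it exists:
            stats_dict.pop(k + ".weight", None)
            stats_dict.pop(k + ".bias", None)
        return stats_dict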

yhyu13 avatar Dec 31 '17 06:12 yhyu13

I think "model-winrate-80.0-357800.bin" is missing some parameters (Wt, Wt2, Wt3). I trained a new model on my own machine, and then it works. Also, selfplay_minirts.sh should add "--keys_in_reply V"; otherwise it fails with "ValueError: Batch[actor1-2].copy_from: Reply[V] is not assigned". @yuandong-tian @yhyu13

zhanghongjie101 avatar Jan 02 '18 03:01 zhanghongjie101

@zhanghongjie101 yes. This is a known issue. I will fix it shortly.

yuandong-tian avatar Jan 10 '18 02:01 yuandong-tian

I have the same problem as @yhyu13.

Tangent-Wei avatar Jan 25 '18 06:01 Tangent-Wei