
How to load a locally trained agent

Open yhymason opened this issue 3 years ago • 4 comments

Hi there, I have followed the tutorial https://aihabitat.org/tutorial/2020/ and trained a PPO agent locally. I wonder how I can specify the trained agent in my config files so that I can run evaluations on my agent.

yhymason avatar Dec 28 '21 06:12 yhymason

Hi! Just change the run type to `eval` and make sure the config points to the checkpoint you want to evaluate:

`python -u habitat_baselines/run.py --exp-config habitat_baselines/config/pointnav/ppo_pointnav_example.yaml --run-type eval`
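For reference, in habitat-lab 0.2.x the checkpoint location used for evaluation is controlled by `EVAL_CKPT_PATH_DIR` in the baselines config. A sketch of the relevant fragment (illustrative, not the full file; adjust the path to wherever your trainer wrote checkpoints):

```yaml
# Fragment of habitat_baselines/config/pointnav/ppo_pointnav_example.yaml
# (only the eval-related key shown). EVAL_CKPT_PATH_DIR may point at a
# directory (every checkpoint inside is evaluated in order) or at a
# single .pth file (only that checkpoint is evaluated).
EVAL_CKPT_PATH_DIR: "new_checkpoints"
```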


mathfac avatar Dec 28 '21 06:12 mathfac

@mathfac thanks for the reply. I tried using the checkpoint file generated by the trainer (e.g. `new_checkpoints/ckpt.99.pth`). However, I got the following error:

File "/home/lci-user/Desktop/workspace/src/habitat-lab-0.2.0/habitat_baselines/agents/ppo_agents.py", line 94, in __init__
    for k, v in ckpt["state_dict"].items()
  File "/home/lci-user/anaconda3/envs/habitat2.0/lib/python3.6/site-packages/torch/nn/modules/module.py", line 1414, in load_state_dict
    self.__class__.__name__, "\n\t".join(error_msgs)))
RuntimeError: Error(s) in loading state_dict for PointNavResNetPolicy:
	Missing key(s) in state_dict: "net.visual_encoder.running_mean_and_var._mean", "net.visual_encoder.running_mean_and_var._var", "net.visual_encoder.running_mean_and_var._count". 

Any idea why? The agent I trained had only DEPTH_SENSOR enabled.

yhymason avatar Jan 04 '22 09:01 yhymason

@erikwijmans would you please take a look as well?

yhymason avatar Jan 06 '22 04:01 yhymason

Hi @yhymason. The attribute `running_mean_and_var` of `PointNavResNetPolicy.net.visual_encoder` has the attributes `_mean`, `_var`, and `_count` if and only if the `normalize_visual_inputs` parameter of `ResNetEncoder` is set to `True` (see `resnet_policy.py`, L373). And `normalize_visual_inputs` is set to `True` if and only if `"rgb" in observation_space.spaces` (see `ppo_agents.py`, L84).
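The gating described above can be sketched like this (a simplified paraphrase of the condition, not the actual habitat-lab source; a plain dict stands in for `observation_space.spaces`):

```python
# Simplified sketch of the logic described above (not the real
# habitat-lab code): the running_mean_and_var buffers exist only when
# an "rgb" sensor is present in the observation space.

def should_normalize_visual_inputs(observation_spaces: dict) -> bool:
    """Mirror of the condition in ppo_agents.py: True iff an RGB
    sensor is part of the observation space."""
    return "rgb" in observation_spaces

# A depth-only agent never creates the _mean/_var/_count buffers,
# so its checkpoint cannot satisfy an RGB-enabled policy.
assert should_normalize_visual_inputs({"rgb": None, "depth": None})
assert not should_normalize_visual_inputs({"depth": None})
```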

My assumption is that your checkpoint was trained with only DEPTH_SENSOR enabled, but you are trying to load it into a policy instance created with both RGB_SENSOR and DEPTH_SENSOR enabled.
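One way to confirm this is to compare the keys the freshly built policy expects against the keys actually stored in the checkpoint before calling `load_state_dict` (the two key sets would normally come from `policy.state_dict().keys()` and `torch.load(path)["state_dict"].keys()`). A minimal, torch-independent helper sketch:

```python
# Hedged sketch: diff the key sets of a model's state_dict and a
# checkpoint's state_dict to see what is missing or unexpected.
# Plain lists stand in for the real key views so the helper does not
# depend on torch being installed.

def diff_state_dict_keys(model_keys, ckpt_keys):
    """Return (missing, unexpected): keys the model expects but the
    checkpoint lacks, and keys the checkpoint has but the model does not."""
    model_keys, ckpt_keys = set(model_keys), set(ckpt_keys)
    return sorted(model_keys - ckpt_keys), sorted(ckpt_keys - model_keys)

# Illustration using a buffer name from the traceback above:
model_keys = [
    "net.visual_encoder.running_mean_and_var._mean",
    "net.visual_encoder.backbone.conv1.weight",
]
ckpt_keys = ["net.visual_encoder.backbone.conv1.weight"]
missing, unexpected = diff_state_dict_keys(model_keys, ckpt_keys)
assert missing == ["net.visual_encoder.running_mean_and_var._mean"]
assert unexpected == []
```

If `missing` lists only the `running_mean_and_var` buffers, the mismatch is exactly the RGB-vs-depth configuration difference described above.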

A more accurate answer would be possible if you could share more details, such as your train/eval config files and the trained checkpoint.

rpartsey avatar Jun 21 '22 16:06 rpartsey

Closing the issue because it has not had recent activity. @yhymason, feel free to re-open it if you still have questions.

rpartsey avatar Sep 13 '22 06:09 rpartsey