allenact
TimeoutError when attempting to run pre-trained RoboTHOR model checkpoint
Problem
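Unable to run the pre-trained RoboTHOR PointNav model checkpoint in evaluation mode. The run fails with a TimeoutError because a `VectorSampledTask` worker produces no output for 300 seconds.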
Steps to reproduce
Followed all instructions at https://allenact.org/tutorials/running-inference-on-a-pretrained-model/ and then ran:
PYTHONPATH=. python allenact/main.py \
training_a_pointnav_model \
-o pretrained_model_ckpts/robothor-pointnav-rgb-resnet/ \
-b projects/tutorials \
-c pretrained_model_ckpts/robothor-pointnav-rgb-resnet/checkpoints/PointNavRobothorRGBPPO/2020-08-31_12-13-30/exp_PointNavRobothorRGBPPO__stage_00__steps_000039031200.pt \
--eval
Got the error:
[02/10 14:58:23 ERROR:] Traceback (most recent call last):
  File "/home/daniel/Documents/github/minigrid_experiments/third-party/allenact/allenact/algorithms/onpolicy_sync/engine.py", line 1992, in process_checkpoints
    eval_package = self.run_eval(
  File "/home/daniel/Documents/github/minigrid_experiments/third-party/allenact/allenact/algorithms/onpolicy_sync/engine.py", line 1782, in run_eval
    num_paused = self.initialize_storage_and_viz(
  File "/home/daniel/Documents/github/minigrid_experiments/third-party/allenact/allenact/algorithms/onpolicy_sync/engine.py", line 455, in initialize_storage_and_viz
    observations = self.vector_tasks.get_observations()
  File "/home/daniel/Documents/github/minigrid_experiments/third-party/allenact/allenact/algorithms/onpolicy_sync/engine.py", line 309, in vector_tasks
    self._vector_tasks = VectorSampledTasks(
  File "/home/daniel/Documents/github/minigrid_experiments/third-party/allenact/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 234, in __init__
    observation_spaces = [
  File "/home/daniel/Documents/github/minigrid_experiments/third-party/allenact/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 237, in <listcomp>
    for space in read_fn(timeout_to_use=5 * self.read_timeout if self.read_timeout is not None else None)  # type: ignore
  File "/home/daniel/Documents/github/minigrid_experiments/third-party/allenact/allenact/algorithms/onpolicy_sync/vector_sampled_tasks.py", line 272, in read_with_timeout
    raise TimeoutError(
TimeoutError: Did not receive output from `VectorSampledTask` worker for 300 seconds.
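For readers unfamiliar with this failure mode, here is a minimal sketch of the general pattern the traceback suggests: the parent waits on a connection to a worker and raises TimeoutError if nothing arrives in time. This is not AllenAct's actual implementation. The function bodies, the `slow_worker` helper, and the use of a thread in place of a worker process are assumptions made for illustration. The traceback shows `timeout_to_use=5 * self.read_timeout`, so the reported 300 seconds would correspond to a `read_timeout` of 60, assuming it is set.

```python
import multiprocessing as mp
import threading
import time


def read_with_timeout(conn, timeout_s):
    """Return the next message from `conn`; raise TimeoutError if none arrives in time."""
    # Connection.poll blocks for at most `timeout_s` seconds and returns
    # True as soon as data is available to read.
    if not conn.poll(timeout_s):
        raise TimeoutError(
            f"Did not receive output from worker for {timeout_s} seconds."
        )
    return conn.recv()


def slow_worker(conn, delay_s):
    """Hypothetical worker that needs `delay_s` seconds before it can reply."""
    time.sleep(delay_s)
    conn.send("observation_space")


if __name__ == "__main__":
    parent_conn, child_conn = mp.Pipe()
    # A thread stands in for the worker process to keep the sketch small.
    worker = threading.Thread(target=slow_worker, args=(child_conn, 1.0))
    worker.start()
    try:
        # The worker needs 1.0 s but we only wait 0.2 s, so this times out.
        read_with_timeout(parent_conn, timeout_s=0.2)
    except TimeoutError as err:
        print(f"Timed out: {err}")
    finally:
        worker.join()
```

In this pattern the timeout only reports that the worker stayed silent. The underlying cause (for example, a worker that is stuck while starting its environment) has to be found in the worker itself.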
Expected behavior
Able to run inference and save metrics to tensorboard.
Desktop
- OS: Ubuntu 22.04
- AllenAct Version: commit 24907f16cd6aace1abb2fef90c8e8667859c38b8
Additional context
Running on Python 3.8 in Anaconda