
How to use pre-trained PyTorch models?

Open · Sunnyzhr opened this issue on Oct 26, 2020 · 32 comments

❓ Questions and Help

According to the tutorials, I downloaded the pre-trained PyTorch models. But where should I unzip them, and how do I point the code at them?

First, I unzipped it to /home/u/Desktop/habitat-lab/habitat_baselines/habitat_baselines_v1(1). Second, I changed deterministic=False, into deterministic="/home/u/Desktop/habitat-lab/habitat_baselines/habitat_baselines_v1(1)/rgbd.pth", in /home/u/Desktop/habitat-lab/habitat_baselines/agents/ppo_agents.py.

Then I run the command (habitat) root@c:~/Desktop/habitat-lab# python -u habitat_baselines/run.py --exp-config habitat_baselines/config/pointnav/ppo_pointnav_example.yaml --run-type eval and I get this error:

WARNING:root:This caffe2 python run does not have GPU support. Will run in CPU only mode. 2020-10-26 15:53:53,103 config: BASE_TASK_CONFIG_PATH: configs/tasks/pointnav.yaml CHECKPOINT_FOLDER: data/new_checkpoints CHECKPOINT_INTERVAL: 50 CMD_TRAILING_OPTS: [] ENV_NAME: NavRLEnv EVAL: SPLIT: val USE_CKPT_CONFIG: True EVAL_CKPT_PATH_DIR: data/new_checkpoints FORCE_BLIND_POLICY: False LOG_FILE: train.log LOG_INTERVAL: 10 NUM_PROCESSES: 1 NUM_UPDATES: 10000 ORBSLAM2: ANGLE_TH: 0.2617993877991494 BETA: 100 CAMERA_HEIGHT: 1.25 DEPTH_DENORM: 10.0 DIST_REACHED_TH: 0.15 DIST_TO_STOP: 0.05 D_OBSTACLE_MAX: 4.0 D_OBSTACLE_MIN: 0.1 H_OBSTACLE_MAX: 1.25 H_OBSTACLE_MIN: 0.375 MAP_CELL_SIZE: 0.1 MAP_SIZE: 40 MIN_PTS_IN_OBSTACLE: 320.0 NEXT_WAYPOINT_TH: 0.5 NUM_ACTIONS: 3 PLANNER_MAX_STEPS: 500 PREPROCESS_MAP: True SLAM_SETTINGS_PATH: habitat_baselines/slambased/data/mp3d3_small1k.yaml SLAM_VOCAB_PATH: habitat_baselines/slambased/data/ORBvoc.txt RL: DDPPO: backbone: resnet50 distrib_backend: GLOO num_recurrent_layers: 2 pretrained: False pretrained_encoder: False pretrained_weights: data/ddppo-models/gibson-2plus-resnet50.pth reset_critic: True rnn_type: LSTM sync_frac: 0.6 train_encoder: True POLICY: OBS_TRANSFORMS: CENTER_CROPPER: HEIGHT: 256 WIDTH: 256 CUBE2EQ: CUBE_LENGTH: 256 HEIGHT: 256 SENSOR_UUIDS: [] WIDTH: 512 ENABLED_TRANSFORMS: () RESIZE_SHORTEST_EDGE: SIZE: 256 name: PointNavBaselinePolicy PPO: clip_param: 0.1 entropy_coef: 0.01 eps: 1e-05 gamma: 0.99 hidden_size: 512 lr: 0.00025 max_grad_norm: 0.5 num_mini_batch: 1 num_steps: 128 ppo_epoch: 4 reward_window_size: 50 tau: 0.95 use_gae: True use_linear_clip_decay: True use_linear_lr_decay: True use_normalized_advantage: True value_loss_coef: 0.5 REWARD_MEASURE: distance_to_goal SLACK_REWARD: -0.01 SUCCESS_MEASURE: spl SUCCESS_REWARD: 10.0 SENSORS: ['RGB_SENSOR', 'DEPTH_SENSOR'] SIMULATOR_GPU_ID: 0 TASK_CONFIG: DATASET: CONTENT_SCENES: ['*'] DATA_PATH: data/datasets/pointnav/habitat-test-scenes/v1/{split}/{split}.json.gz SCENES_DIR: data/scene_datasets SPLIT: train TYPE: PointNav-v1 ENVIRONMENT: ITERATOR_OPTIONS: CYCLE: True GROUP_BY_SCENE: True MAX_SCENE_REPEAT_EPISODES: -1 MAX_SCENE_REPEAT_STEPS: 10000 NUM_EPISODE_SAMPLE: -1 SHUFFLE: True STEP_REPETITION_RANGE: 0.2 MAX_EPISODE_SECONDS: 10000000 MAX_EPISODE_STEPS: 500 PYROBOT: BASE_CONTROLLER: proportional BASE_PLANNER: none BUMP_SENSOR: TYPE: PyRobotBumpSensor DEPTH_SENSOR: CENTER_CROP: False HEIGHT: 480 MAX_DEPTH: 5.0 MIN_DEPTH: 0.0 NORMALIZE_DEPTH: True TYPE: PyRobotDepthSensor WIDTH: 640 LOCOBOT: ACTIONS: ['BASE_ACTIONS', 'CAMERA_ACTIONS'] BASE_ACTIONS: ['go_to_relative', 'go_to_absolute'] CAMERA_ACTIONS: ['set_pan', 'set_tilt', 'set_pan_tilt'] RGB_SENSOR: CENTER_CROP: False HEIGHT: 480 TYPE: PyRobotRGBSensor WIDTH: 640 ROBOT: locobot ROBOTS: ['locobot'] SENSORS: ['RGB_SENSOR', 'DEPTH_SENSOR', 'BUMP_SENSOR'] SEED: 100 SIMULATOR: ACTION_SPACE_CONFIG: v0 AGENTS: ['AGENT_0'] AGENT_0: ANGULAR_ACCELERATION: 12.56 ANGULAR_FRICTION: 1.0 COEFFICIENT_OF_RESTITUTION: 0.0 HEIGHT: 1.5 IS_SET_START_STATE: False LINEAR_ACCELERATION: 20.0 LINEAR_FRICTION: 0.5 MASS: 32.0 RADIUS: 0.1 SENSORS: ['RGB_SENSOR'] START_POSITION: [0, 0, 0] START_ROTATION: [0, 0, 0, 1] DEFAULT_AGENT_ID: 0 DEPTH_SENSOR: HEIGHT: 256 HFOV: 90 MAX_DEPTH: 10.0 MIN_DEPTH: 0.0 NORMALIZE_DEPTH: True ORIENTATION: [0.0, 0.0, 0.0] POSITION: [0, 1.25, 0] TYPE: HabitatSimDepthSensor WIDTH: 256 FORWARD_STEP_SIZE: 0.25 HABITAT_SIM_V0: ALLOW_SLIDING: True ENABLE_PHYSICS: False GPU_DEVICE_ID: 0 GPU_GPU: False PHYSICS_CONFIG_FILE: 
./data/default.phys_scene_config.json RGB_SENSOR: HEIGHT: 256 HFOV: 90 ORIENTATION: [0.0, 0.0, 0.0] POSITION: [0, 1.25, 0] TYPE: HabitatSimRGBSensor WIDTH: 256 SCENE: data/scene_datasets/habitat-test-scenes/van-gogh-room.glb SEED: 100 SEMANTIC_SENSOR: HEIGHT: 480 HFOV: 90 ORIENTATION: [0.0, 0.0, 0.0] POSITION: [0, 1.25, 0] TYPE: HabitatSimSemanticSensor WIDTH: 640 TILT_ANGLE: 15 TURN_ANGLE: 10 TYPE: Sim-v0 TASK: ACTIONS: ANSWER: TYPE: AnswerAction LOOK_DOWN: TYPE: LookDownAction LOOK_UP: TYPE: LookUpAction MOVE_FORWARD: TYPE: MoveForwardAction STOP: TYPE: StopAction TELEPORT: TYPE: TeleportAction TURN_LEFT: TYPE: TurnLeftAction TURN_RIGHT: TYPE: TurnRightAction ANSWER_ACCURACY: TYPE: AnswerAccuracy COLLISIONS: TYPE: Collisions COMPASS_SENSOR: TYPE: CompassSensor CORRECT_ANSWER: TYPE: CorrectAnswer DISTANCE_TO_GOAL: DISTANCE_TO: POINT TYPE: DistanceToGoal EPISODE_INFO: TYPE: EpisodeInfo GOAL_SENSOR_UUID: pointgoal_with_gps_compass GPS_SENSOR: DIMENSIONALITY: 2 TYPE: GPSSensor HEADING_SENSOR: TYPE: HeadingSensor IMAGEGOAL_SENSOR: TYPE: ImageGoalSensor INSTRUCTION_SENSOR: TYPE: InstructionSensor INSTRUCTION_SENSOR_UUID: instruction MEASUREMENTS: ['DISTANCE_TO_GOAL', 'SUCCESS', 'SPL'] OBJECTGOAL_SENSOR: GOAL_SPEC: TASK_CATEGORY_ID GOAL_SPEC_MAX_VAL: 50 TYPE: ObjectGoalSensor POINTGOAL_SENSOR: DIMENSIONALITY: 2 GOAL_FORMAT: POLAR TYPE: PointGoalSensor POINTGOAL_WITH_GPS_COMPASS_SENSOR: DIMENSIONALITY: 2 GOAL_FORMAT: POLAR TYPE: PointGoalWithGPSCompassSensor POSSIBLE_ACTIONS: ['STOP', 'MOVE_FORWARD', 'TURN_LEFT', 'TURN_RIGHT'] PROXIMITY_SENSOR: MAX_DETECTION_RADIUS: 2.0 TYPE: ProximitySensor QUESTION_SENSOR: TYPE: QuestionSensor SENSORS: ['POINTGOAL_WITH_GPS_COMPASS_SENSOR'] SOFT_SPL: TYPE: SoftSPL SPL: TYPE: SPL SUCCESS: SUCCESS_DISTANCE: 0.2 TYPE: Success SUCCESS_DISTANCE: 0.2 TOP_DOWN_MAP: DRAW_BORDER: True DRAW_GOAL_AABBS: True DRAW_GOAL_POSITIONS: True DRAW_SHORTEST_PATH: True DRAW_SOURCE: True DRAW_VIEW_POINTS: True FOG_OF_WAR: DRAW: True FOV: 90 VISIBILITY_DIST: 5.0 MAP_PADDING: 3 MAP_RESOLUTION: 1024 MAX_EPISODE_STEPS: 1000 TYPE: TopDownMap TYPE: Nav-v0 TENSORBOARD_DIR: tb TEST_EPISODE_COUNT: 2 TORCH_GPU_ID: 0 TRAINER_NAME: ppo VIDEO_DIR: video_dir VIDEO_OPTION: ['disk', 'tensorboard'] /home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/tensorflow-1.13.1-py3.6-linux-x86_64.egg/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_qint8 = np.dtype([("qint8", np.int8, 1)]) /home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/tensorflow-1.13.1-py3.6-linux-x86_64.egg/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. _np_quint8 = np.dtype([("quint8", np.uint8, 1)]) /home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/tensorflow-1.13.1-py3.6-linux-x86_64.egg/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'. 
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/tensorflow-1.13.1-py3.6-linux-x86_64.egg/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/tensorflow-1.13.1-py3.6-linux-x86_64.egg/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/tensorflow-1.13.1-py3.6-linux-x86_64.egg/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
Traceback (most recent call last):
  File "habitat_baselines/run.py", line 79, in <module>
    main()
  File "habitat_baselines/run.py", line 40, in main
    run_exp(**vars(args))
  File "habitat_baselines/run.py", line 75, in run_exp
    execute_exp(config, run_type)
  File "habitat_baselines/run.py", line 60, in execute_exp
    trainer.eval()
  File "/home/u/Desktop/habitat-lab/habitat_baselines/common/base_trainer.py", line 103, in eval
    self.config.EVAL_CKPT_PATH_DIR, prev_ckpt_ind
  File "/home/u/Desktop/habitat-lab/habitat_baselines/utils/common.py", line 123, in poll_checkpoint_folder
    f"invalid checkpoint folder " f"path {checkpoint_folder}"
AssertionError: invalid checkpoint folder path data/new_checkpoints

Sunnyzhr · Oct 26 '20

You still have to change EVAL_CKPT_PATH_DIR to point to the pretrained checkpoint you want to load.
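For example, if you saved the downloaded weights under data/checkpoints/, the eval yaml could point directly at one file (a sketch; the path is a placeholder, and if I remember correctly EVAL_CKPT_PATH_DIR may be either a single .pth file or a folder of checkpoints):

EVAL_CKPT_PATH_DIR: data/checkpoints/rgbd.pth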

Skylion007 · Nov 06 '20

Hi, I am quite late to this issue but I didn't want to open a new one for a similar problem: I am trying to use the pretrained PPO and DDPPO models and I'm running into some issues (to make sure that it's not a path issue - the checkpoint is loaded: 2020-11-16 20:09:19,161 =======current_ckpt: data/new_checkpoints/depth.pth=======).

  1. The checkpoints don't seem to contain configs - config = self._setup_eval_config(ckpt_dict["config"]) fails with both the PPO models and the DDPPO models, so I have to fall back onto the default config, e.g. config = self.config.clone() in PPOTrainer. I don't have this issue with checkpoints I have generated myself. Is this a new change (I re-downloaded and installed habitat to check, so my code is current)? If none of the checkpoint files contain configs, I guess for the PPO and DDPPO models the correct configs would be in the default .yaml files - is this correct? This brings up the problem of the varying DDPPO models, because as I understand it the best-performing depth model uses a 1024 state encoder, unlike the default ddppo yaml - is there a place where the different config files for the different DD-PPO checkpoints are stored? This is the error: config = self._setup_eval_config(ckpt_dict["config"]) KeyError: 'config'
  2. I consistently run into naming issues with the checkpoint dicts; I'm assuming this is because of newer, more informative names. The error is: RuntimeError: Error(s) in loading state_dict for PPO: Missing key(s) in state_dict: "actor_critic.net.visual_encoder.cnn.0.weight", "actor_critic.net.visual_encoder.cnn.0.bias", "actor_critic.net.visual_encoder.cnn.2.weight", "actor_critic.net.visual_encoder.cnn.2.bias", "actor_critic.net.visual_encoder.cnn.4.weight", "actor_critic.net.visual_encoder.cnn.4.bias", "actor_critic.net.visual_encoder.cnn.6.weight", "actor_critic.net.visual_encoder.cnn.6.bias", "actor_critic.net.state_encoder.rnn.weight_ih_l0", "actor_critic.net.state_encoder.rnn.weight_hh_l0", "actor_critic.net.state_encoder.rnn.bias_ih_l0", "actor_critic.net.state_encoder.rnn.bias_hh_l0", "actor_critic.critic.fc.weight", "actor_critic.critic.fc.bias". Unexpected key(s) in state_dict: "actor_critic.net.cnn.0.weight", "actor_critic.net.cnn.0.bias", "actor_critic.net.cnn.2.weight", "actor_critic.net.cnn.2.bias", "actor_critic.net.cnn.4.weight", "actor_critic.net.cnn.4.bias", "actor_critic.net.cnn.6.weight", "actor_critic.net.cnn.6.bias", "actor_critic.net.rnn.weight_ih_l0", "actor_critic.net.rnn.weight_hh_l0", "actor_critic.net.rnn.bias_ih_l0", "actor_critic.net.rnn.bias_hh_l0", "actor_critic.net.critic_linear.weight", "actor_critic.net.critic_linear.bias". In short: I can't use the configs from the checkpoints (if they contain configs at all), and I can't load pretrained weights into my current models because of name mismatches. Thanks! EDIT: I wrote a script that simply renames the dictionary entries and everything works for the PPO case (a sketch of that kind of script is below); if the DDPPO problem is solved by just doubling the hidden_size, then the same solution could apply there.
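In case it is useful to anyone, here is a minimal sketch of such a renaming script. The prefix map is read straight off the Missing/Unexpected key lists above, the file names are placeholders, and it assumes the weights live under a "state_dict" entry (as the load error suggests) - double-check the mapping against your own checkpoint before relying on it:

import torch

# Old prefix (in the released checkpoint) -> new prefix (expected by the
# current PPO policy), taken from the Missing/Unexpected keys above.
RENAMES = {
    "actor_critic.net.cnn.": "actor_critic.net.visual_encoder.cnn.",
    "actor_critic.net.rnn.": "actor_critic.net.state_encoder.rnn.",
    "actor_critic.net.critic_linear.": "actor_critic.critic.fc.",
}

def rename_key(key):
    # Replace the first matching old prefix with its new counterpart.
    for old, new in RENAMES.items():
        if key.startswith(old):
            return new + key[len(old):]
    return key

ckpt = torch.load("depth.pth", map_location="cpu")
ckpt["state_dict"] = {rename_key(k): v for k, v in ckpt["state_dict"].items()}
torch.save(ckpt, "depth_renamed.pth")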

ronjamoller · Nov 16 '20

You still have to change EVAL_CKPT_PATH_DIR to point to the pretrained checkpoint you want to load.

Thanks for your reply, @Skylion007! Sorry to bother you again, but I still don't know how to change the checkpoint.

I don't find checkpoint-style .pth files among the pre-trained PyTorch models, but I did find checkpoint-related files in the original /habitat-lab/data/test_checkpoints folder, so I defined the checkpoint as below:

# Excerpt from habitat_baselines/agents/ppo_agents.py
from habitat.config import Config


def get_default_config():
    c = Config()
    c.INPUT_TYPE = "blind"
    # c.MODEL_PATH = "data/checkpoints/blind.pth"
    c.MODEL_PATH = r"/home/u/Desktop/habitat-lab/data/test_checkpoints/ppo/pointnav/ckpt.0.pth"
    c.RESOLUTION = 256
    c.HIDDEN_SIZE = 512
    c.RANDOM_SEED = 7
    c.PTH_GPU_ID = 0
    c.GOAL_SENSOR_UUID = "pointgoal_with_gps_compass"
    return c

Did I define the checkpoint correctly? There are only four files (blind.pth, rgb.pth, depth.pth, rgbd.pth) in the pre-trained PyTorch models.

I really appreciate your help!

Sunnyzhr · Dec 06 '20

Thanks for your attention! Moreover, after deleting everything and retrying the tutorials, I no longer encounter the checkpoint error. Instead, I get the error below:

....... TENSORBOARD_DIR: tb TEST_EPISODE_COUNT: 2 TORCH_GPU_ID: 0 TRAINER_NAME: ppo VIDEO_DIR: video_dir VIDEO_OPTION: ['disk', 'tensorboard'] 2020-12-06 23:20:59,906 Initializing dataset PointNav-v1 2020-12-06 23:21:00,907 Initializing dataset PointNav-v1 2020-12-06 23:21:00,909 initializing sim Sim-v0 WARNING: Logging before InitGoogleLogging() is written to STDERR I1206 23:21:00.915284 6450 AssetAttributesManager.cpp:122] Asset attributes (capsule3DSolid : capsule3DSolid_hemiRings_4_cylRings_1_segments_12_halfLen_0.75_useTexCoords_false_useTangents_false) created and registered. I1206 23:21:00.915319 6450 AssetAttributesManager.cpp:122] Asset attributes (capsule3DWireframe : capsule3DWireframe_hemiRings_8_cylRings_1_segments_16_halfLen_1) created and registered. I1206 23:21:00.915344 6450 AssetAttributesManager.cpp:122] Asset attributes (coneSolid : coneSolid_segments_12_halfLen_1.25_rings_1_useTexCoords_false_useTangents_false_capEnd_true) created and registered. I1206 23:21:00.915381 6450 AssetAttributesManager.cpp:122] Asset attributes (coneWireframe : coneWireframe_segments_32_halfLen_1.25) created and registered. I1206 23:21:00.915391 6450 AssetAttributesManager.cpp:122] Asset attributes (cubeSolid : cubeSolid) created and registered. I1206 23:21:00.915398 6450 AssetAttributesManager.cpp:122] Asset attributes (cubeWireframe : cubeWireframe) created and registered. I1206 23:21:00.915419 6450 AssetAttributesManager.cpp:122] Asset attributes (cylinderSolid : cylinderSolid_rings_1_segments_12_halfLen_1_useTexCoords_false_useTangents_false_capEnds_true) created and registered. I1206 23:21:00.915438 6450 AssetAttributesManager.cpp:122] Asset attributes (cylinderWireframe : cylinderWireframe_rings_1_segments_32_halfLen_1) created and registered. I1206 23:21:00.915462 6450 AssetAttributesManager.cpp:122] Asset attributes (icosphereSolid : icosphereSolid_subdivs_1) created and registered. I1206 23:21:00.915489 6450 AssetAttributesManager.cpp:122] Asset attributes (icosphereWireframe : icosphereWireframe_subdivs_1) created and registered. I1206 23:21:00.915503 6450 AssetAttributesManager.cpp:122] Asset attributes (uvSphereSolid : uvSphereSolid_rings_8_segments_16_useTexCoords_false_useTangents_false) created and registered. I1206 23:21:00.915516 6450 AssetAttributesManager.cpp:122] Asset attributes (uvSphereWireframe : uvSphereWireframe_rings_16_segments_32) created and registered. 
I1206 23:21:00.915522 6450 AssetAttributesManager.cpp:108] AssetAttributesManager::buildCtorFuncPtrMaps : Built default primitive asset templates : 12 I1206 23:21:00.915926 6450 ObjectAttributesManager.cpp:336] Parsing object library directory: ./data/objects I1206 23:21:00.915954 6450 ObjectAttributesManager.cpp:294] Loading file-based object template: ./data/objects/banana.phys_properties.json I1206 23:21:00.916010 6450 ObjectAttributesManager.cpp:294] Loading file-based object template: ./data/objects/cheezit.phys_properties.json I1206 23:21:00.916061 6450 ObjectAttributesManager.cpp:294] Loading file-based object template: ./data/objects/chefcan.phys_properties.json I1206 23:21:00.916110 6450 ObjectAttributesManager.cpp:294] Loading file-based object template: ./data/objects/largeclamp.phys_properties.json I1206 23:21:00.916186 6450 ObjectAttributesManager.cpp:294] Loading file-based object template: ./data/objects/skillet.phys_properties.json I1206 23:21:00.916262 6450 ObjectAttributesManager.cpp:306] Loaded file-based object templates: 5 I1206 23:21:00.916271 6450 PhysicsAttributesManager.cpp:39] File (./data/default.phys_scene_config.json) Based physics manager attributes created and registered. I1206 23:21:00.916357 6450 StageAttributesManager.cpp:79] File (data/scene_datasets/habitat-test-scenes/van-gogh-room.glb) Based stage attributes created and registered. I1206 23:21:00.916385 6450 Simulator.cpp:145] Loading navmesh from data/scene_datasets/habitat-test-scenes/van-gogh-room.navmesh I1206 23:21:00.916404 6450 Simulator.cpp:147] Loaded. I1206 23:21:00.916414 6450 SceneGraph.h:93] Created DrawableGroup: Renderer: GeForce GTX 1650/PCIe/SSE2 by NVIDIA Corporation OpenGL version: 4.6.0 NVIDIA 450.80.02 Using optional features: GL_ARB_ES2_compatibility GL_ARB_direct_state_access GL_ARB_get_texture_sub_image GL_ARB_invalidate_subdata GL_ARB_multi_bind GL_ARB_robustness GL_ARB_separate_shader_objects GL_ARB_texture_filter_anisotropic GL_ARB_texture_storage GL_ARB_texture_storage_multisample GL_ARB_vertex_array_object GL_KHR_debug Using driver workarounds: no-forward-compatible-core-context no-layout-qualifiers-on-old-glsl nv-zero-context-profile-mask nv-implementation-color-read-format-dsa-broken nv-cubemap-inconsistent-compressed-image-size nv-cubemap-broken-full-compressed-image-query nv-compressed-block-size-in-bits I1206 23:21:00.984727 6450 ResourceManager.cpp:934] Importing Basis files as BC7 I1206 23:21:01.581944 6450 ResourceManager.cpp:302] ResourceManager::loadStage : Not loading semantic mesh W1206 23:21:01.581966 6450 Simulator.cpp:235] : The active scene does not contain semantic annotations. I1206 23:21:01.582213 6450 simulator.py:181] Loaded navmesh data/scene_datasets/habitat-test-scenes/van-gogh-room.navmesh 2020-12-06 23:21:01,584 Initializing task Nav-v0 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:05<00:00, 2.39s/it]2020-12-06 23:21:11,371 Video created: video_dir/episode=26-ckpt=0-distance_to_goal=2.06-success=0.00-spl=0.00-collisions.count=0.00.mp4 IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (1002, 256) to (1008, 256) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility). 
100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 15.41it/s]
[swscaler @ 0x5f8ab00] Warning: data is not aligned! This can lead to a speed loss
Traceback (most recent call last):
  File "/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/torch-1.7.0-py3.6-linux-x86_64.egg/torch/utils/tensorboard/summary.py", line 467, in make_video
    clip.write_gif(filename, verbose=False, logger=None)
TypeError: write_gif() got an unexpected keyword argument 'verbose'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/torch-1.7.0-py3.6-linux-x86_64.egg/torch/utils/tensorboard/summary.py", line 470, in make_video
    clip.write_gif(filename, verbose=False, progress_bar=False)
TypeError: write_gif() got an unexpected keyword argument 'verbose'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "habitat_baselines/run.py", line 79, in <module>
    main()
  File "habitat_baselines/run.py", line 40, in main
    run_exp(**vars(args))
  File "habitat_baselines/run.py", line 75, in run_exp
    execute_exp(config, run_type)
  File "habitat_baselines/run.py", line 60, in execute_exp
    trainer.eval()
  File "/home/u/Desktop/habitat-lab/habitat_baselines/common/base_trainer.py", line 111, in eval
    checkpoint_index=prev_ckpt_ind,
  File "/home/u/Desktop/habitat-lab/habitat_baselines/rl/ppo/ppo_trainer.py", line 609, in _eval_checkpoint
    tb_writer=writer,
  File "/home/u/Desktop/habitat-lab/habitat_baselines/utils/common.py", line 175, in generate_video
    f"episode{episode_id}", checkpoint_idx, images, fps=fps
  File "/home/u/Desktop/habitat-lab/habitat_baselines/common/tensorboard_utils.py", line 67, in add_video_from_np_images
    video_name, video_tensor, fps=fps, global_step=step_idx
  File "/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/torch-1.7.0-py3.6-linux-x86_64.egg/torch/utils/tensorboard/writer.py", line 666, in add_video
    video(tag, vid_tensor, fps), global_step, walltime)
  File "/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/torch-1.7.0-py3.6-linux-x86_64.egg/torch/utils/tensorboard/summary.py", line 442, in video
    video = make_video(tensor, fps)
  File "/home/u/anaconda3/envs/habitat/lib/python3.6/site-packages/torch-1.7.0-py3.6-linux-x86_64.egg/torch/utils/tensorboard/summary.py", line 472, in make_video
    clip.write_gif(filename, verbose=False)
TypeError: write_gif() got an unexpected keyword argument 'verbose'

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:05<00:00, 2.76s/it]
Exception ignored in: <bound method VectorEnv.__del__ of <habitat.core.vector_env.VectorEnv object at 0x7f3e2ac62898>>
Traceback (most recent call last):
  File "/home/u/Desktop/habitat-lab/habitat/core/vector_env.py", line 529, in __del__
  File "/home/u/Desktop/habitat-lab/habitat/core/vector_env.py", line 413, in close
  File "/home/u/anaconda3/envs/habitat/lib/python3.6/multiprocessing/connection.py", line 206, in send
AttributeError: 'NoneType' object has no attribute 'dumps'
(habitat) root@c:~/Desktop/habitat-lab#

Sunnyzhr · Dec 06 '20

Hi @Sunnyzhr, I am not a developer but I have some experience with habitat. I am not sure if you have already solved your issues, but I will try to go through them:

  1. You mentioned that you set a deterministic parameter to a path; I think this parameter should only be True or False, not a checkpoint path. Deterministic here refers to the type of policy, I think, as in deterministic vs. stochastic - I would leave it at False.
  2. Re: checkpoint files - if your goal is simply to evaluate a pretrained model, you can just put it into data/new_checkpoints/ and the eval script will automatically evaluate whatever is in the new_checkpoints folder; depending on which model you want, put e.g. the depth.pth file you downloaded into new_checkpoints. This is something to keep in mind for later: if there are several checkpoints in the folder, the script will try to evaluate all of them, even ones that don't fit your architecture, so this folder should contain only the files you actually want to evaluate.
  3. With the write_gif issue: this seems to be an issue in the tensorboard code. I have never had this issue, but I suspect it might be a mismatch in your anaconda env; if you installed your anaconda env according to the instructions and only pulled stable releases, this might be a bug. As a workaround I would try to modify ppo_pointnav_example.yaml to remove any option that generates video for tensorboard: change VIDEO_OPTION: ["disk", "tensorboard"] to VIDEO_OPTION: ["disk"] or even VIDEO_OPTION: [] (see the fragment below). This way the evaluation still runs through, but no videos for manual inspection are generated. Hope this helps, and sorry if I explained things you already fixed. Best of luck.
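For clarity, the relevant line in ppo_pointnav_example.yaml would then read (only this key changes; the commented line is the alternative):

VIDEO_OPTION: ["disk"]
# or, to disable video generation entirely:
# VIDEO_OPTION: []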

ronjamoller · Dec 06 '20

I am also encountering a similar problem when trying to reproduce the results of the pretrained models. !python -u habitat_baselines/run.py --exp-config habitat_baselines/config/pointnav/ppo_pointnav.yaml --run-type eval

I am getting the following error:

/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
2020-12-30 15:09:23,572 =======current_ckpt: data/new_checkpoints/depth.pth=======
Traceback (most recent call last):
  File "habitat_baselines/run.py", line 79, in <module>
    main()
  File "habitat_baselines/run.py", line 40, in main
    run_exp(**vars(args))
  File "habitat_baselines/run.py", line 75, in run_exp
    execute_exp(config, run_type)
  File "habitat_baselines/run.py", line 60, in execute_exp
    trainer.eval()
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/common/base_trainer.py", line 128, in eval
    checkpoint_index=prev_ckpt_ind,
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/rl/ppo/ppo_trainer.py", line 495, in _eval_checkpoint
    config = self._setup_eval_config(ckpt_dict["config"])
KeyError: 'config'

It would be very helpful if someone could suggest a solution.

KenaHemnani · Dec 30 '20

I have made a new_checkpoints folder, so the checkpoint sits at /content/habitat-lab/data/new_checkpoints/depth.pth, and I am trying to run the above command.

KenaHemnani · Dec 30 '20

Hi @KenaHemnani, I am not a developer but I have some experience with habitat. I encountered your error: the issue is that newer checkpoint files also contain the config they were trained with, while older ones do not, and the PPO checkpoints are older. As a workaround you can just take the config specified by your yaml. The code in the PPO trainer that throws your error is if self.config.EVAL.USE_CKPT_CONFIG: config = self._setup_eval_config(ckpt_dict["config"]) else: config = self.config.clone() - it tries to get the config from the checkpoint dict but doesn't find it. Instead you can do try: config = self._setup_eval_config(ckpt_dict["config"]) except KeyError: config = self.config.clone() (see the sketch after the list below), or just clone the config by default. Edit: to preempt further issues:

  1. Depending on your habitat installation, the pretrained models might not give good results and/or produce more errors. I sometimes had issues where the layers of the networks were renamed; this can be fixed with a script that simply changes the names in the dict (like the renaming sketch earlier in this thread). There is an issue on GitHub where the devs even share such a script, I think.
  2. Also depending on your software version, the order of the robot's moves has changed, and as a result a robot using the pretrained PPO model might just move incorrectly (think left instead of right) - because of this I got a success rate of pretty much nothing. If this is the case, you can try to change the order of the robot's moves (forward, left, right, etc.) or retrain completely.
  3. For me the DDPPO models had all the issues listed above, but the robot moves correctly. Depending on what you are trying to do, it might be a better use of your time to go straight to those models. All of this advice is based only on my experiences (which might be caused by my own code) and is around one month out of date, so things might have changed. Best of luck!
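To make the config workaround above concrete, this is roughly what the patched section of _eval_checkpoint in habitat_baselines/rl/ppo/ppo_trainer.py would look like (a sketch based on my memory of the trainer code; only the try/except is the actual change):

# Inside PPOTrainer._eval_checkpoint, after the checkpoint is loaded:
ckpt_dict = self.load_checkpoint(checkpoint_path, map_location="cpu")

# The released PPO/DDPPO checkpoints have no "config" entry, so fall
# back to the config that came from the yaml instead of crashing.
try:
    config = self._setup_eval_config(ckpt_dict["config"])
except KeyError:
    config = self.config.clone()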

ronjamoller · Dec 30 '20

@ronjamoller Thank you very much for the suggestion. I have tried both ways you mentioned above, but both ways I get the same error as before. Could there be any other way?

Thank you for these suggestions. They are really helpful.

KenaHemnani · Dec 30 '20

When I try running:

!python -u habitat_baselines/run.py --exp-config habitat_baselines/config/pointnav/ppo_pointnav_example.yaml --run-type train
!python -u habitat_baselines/run.py --exp-config habitat_baselines/config/pointnav/ppo_pointnav_example.yaml --run-type eval

i.e., training ppo_pointnav_example from scratch and then running it in eval mode, it works fine. But for the pretrained model checkpoints it shows the above errors.

KenaHemnani · Dec 30 '20

Have you tried replacing config = self._setup_eval_config(ckpt_dict["config"]) with config = self.config.clone() in the places where the program gives the error? That is how I fixed that issue. As for your experiments, that is exactly the problem: newer checkpoint files contain a config entry, older checkpoint files do not. While it might be feasible to retrain the simple PPO model, I think it is worth fixing this issue so you can get a good pretrained model to run later (the DDPPO models are way too big to train from scratch). tl;dr: if the error is the same, then there is still a part of the code that tries to access ckpt_dict["config"], which does not exist - you have to change those places as above. It is uncomfortable to change the code in habitat_baselines/rl/ppo/ppo_trainer.py, but I don't see another way. Good luck, don't be deterred!

ronjamoller · Dec 30 '20

@ronjamoller Yes, I tried that way, and even tried keeping only config = trainer.config.clone(), but it gives the same error. Then from where can self.config.EVAL.USE_CKPT_CONFIG be made False? I think it is still True, judging by the information it prints before the error.

KenaHemnani · Dec 30 '20

You will need a config later on, so you can set the parameter to False, but you will still need config = trainer.config.clone(). What is the error if you only have config = trainer.config.clone()?

ronjamoller · Dec 30 '20

I am getting the following error:

Neither ifconfig (ifconfig -a) nor ip (ip address show) commands are available, listing network interfaces is likely to fail
2020-12-30 19:27:52,521 config: BASE_TASK_CONFIG_PATH: configs/tasks/pointnav_gibson.yaml CHECKPOINT_FOLDER: data/new_checkpoints CHECKPOINT_INTERVAL: 2000 CMD_TRAILING_OPTS: [] ENV_NAME: NavRLEnv EVAL: SPLIT: val USE_CKPT_CONFIG: True EVAL_CKPT_PATH_DIR: data/new_checkpoints FORCE_BLIND_POLICY: False LOG_FILE: train.log LOG_INTERVAL: 25 NUM_PROCESSES: 4 NUM_UPDATES: 270000 ORBSLAM2:

. . . . . . . . .

TRAINER_NAME: ppo VIDEO_DIR: video_dir VIDEO_OPTION: []
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:526: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint8 = np.dtype([("qint8", np.int8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:527: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint8 = np.dtype([("quint8", np.uint8, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:528: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint16 = np.dtype([("qint16", np.int16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:529: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_quint16 = np.dtype([("quint16", np.uint16, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:530: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
_np_qint32 = np.dtype([("qint32", np.int32, 1)])
/usr/local/lib/python3.6/site-packages/tensorflow/python/framework/dtypes.py:535: FutureWarning: Passing (type, 1) or '1type' as a synonym of type is deprecated; in a future version of numpy, it will be understood as (type, (1,)) / '(1,)type'.
np_resource = np.dtype([("resource", np.ubyte, 1)])
2020-12-30 19:27:55,489 =======current_ckpt: data/new_checkpoints/ckpt.0.pth=======
Traceback (most recent call last):
  File "habitat_baselines/run.py", line 79, in <module>
    main()
  File "habitat_baselines/run.py", line 40, in main
    run_exp(**vars(args))
  File "habitat_baselines/run.py", line 75, in run_exp
    execute_exp(config, run_type)
  File "habitat_baselines/run.py", line 60, in execute_exp
    trainer.eval()
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/common/base_trainer.py", line 128, in eval
    checkpoint_index=prev_ckpt_ind,
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/rl/ppo/ppo_trainer.py", line 495, in _eval_checkpoint
    config = self._setup_eval_config(ckpt_dict["config"])
KeyError: 'config'

KenaHemnani · Dec 30 '20

Thank you. As you can see, the code still tries to access ckpt_dict["config"], so you have to change that section of the code in /usr/local/lib/python3.6/site-packages/habitat_baselines/rl/ppo/ppo_trainer.py. Edit: as you have pointed out, you can also set self.config.EVAL.USE_CKPT_CONFIG to False, but I don't think the default .yaml files contain that parameter, so you would have to set it manually in run.py, which is not really straightforward. It is set to True by default in habitat_baselines/common/default.py. So your options are: change the part of the code that throws the error, set the default value to False in default.py, or try to set the value to False externally (habitat config files are very modular, so I don't know if I would recommend that).
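Another option that might work (an untested sketch: run.py forwards trailing key/value pairs to the config - you can see them land in CMD_TRAILING_OPTS in the config printout above - so the override should be expressible on the command line):

python -u habitat_baselines/run.py \
    --exp-config habitat_baselines/config/pointnav/ppo_pointnav.yaml \
    --run-type eval \
    EVAL.USE_CKPT_CONFIG False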

ronjamoller · Dec 30 '20

Hi @ronjamoller, I tried what you said - thank you very much for the help!! The KeyError no longer appears, but I am now getting a different error:

Traceback (most recent call last):
  File "habitat_baselines/run.py", line 79, in <module>
    main()
  File "habitat_baselines/run.py", line 40, in main
    run_exp(**vars(args))
  File "habitat_baselines/run.py", line 75, in run_exp
    execute_exp(config, run_type)
  File "habitat_baselines/run.py", line 60, in execute_exp
    trainer.eval()
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/common/base_trainer.py", line 128, in eval
    checkpoint_index=prev_ckpt_ind,
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/rl/ppo/ppo_trainer.py", line 512, in _eval_checkpoint
    self.envs = construct_envs(config, get_env_class(config.ENV_NAME))
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/utils/env_utils.py", line 59, in construct_envs
    scenes = dataset.get_scenes_to_load(config.TASK_CONFIG.DATASET)
  File "/usr/local/lib/python3.6/site-packages/habitat/datasets/pointnav/pointnav_dataset.py", line 43, in get_scenes_to_load
    assert cls.check_config_paths_exist(config)
AssertionError

KenaHemnani · Dec 30 '20

Congrats on taking the first hurdle :D - this looks like your dataset is not in the correct place. There are sections on the habitat frontpage about setting up your dataset: you need the .json files that contain the train and test episodes (basically the start and end point of the robot) and the corresponding files for the rooms (.glb and .navmesh). The .jsons have to be in a specific folder structure, and so do the room files. In habitat-lab/habitat/config/default.py you can see that the default path for the jsons is data/datasets/pointnav/habitat-test-scenes/v1/{split}/{split}.json.gz; split is replaced by test, train, and val, so there are 3 folders, each containing a json. You can download the jsons from the habitat website. The files for the houses (.glb and .navmesh) go into data/scene_datasets/habitat-test-scenes/. To double check, you can open the .json files; the entries contain the path of the scene file needed for each episode (see also the small sanity-check sketch below).
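In case it helps, a small sanity check for that layout (a sketch to be run from the habitat-lab root; the scene name is just the test scene that appears in your logs):

import os.path as osp

# Episode definitions, one folder per split.
for split in ("train", "val", "test"):
    p = f"data/datasets/pointnav/habitat-test-scenes/v1/{split}/{split}.json.gz"
    print(p, "->", "OK" if osp.isfile(p) else "MISSING")

# Scene geometry and its navmesh.
scene = "data/scene_datasets/habitat-test-scenes/van-gogh-room.glb"
for f in (scene, scene.replace(".glb", ".navmesh")):
    print(f, "->", "OK" if osp.isfile(f) else "MISSING")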

ronjamoller · Dec 30 '20

Yes, I loaded the dataset. Thank you very much @ronjamoller. Sorry to bother you again. After loading the dataset I am getting the following; how shall I resolve this?

Traceback (most recent call last):
  File "habitat_baselines/run.py", line 79, in <module>
    main()
  File "habitat_baselines/run.py", line 40, in main
    run_exp(**vars(args))
  File "habitat_baselines/run.py", line 75, in run_exp
    execute_exp(config, run_type)
  File "habitat_baselines/run.py", line 60, in execute_exp
    trainer.eval()
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/common/base_trainer.py", line 128, in eval
    checkpoint_index=prev_ckpt_ind,
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/rl/ppo/ppo_trainer.py", line 512, in _eval_checkpoint
    self.envs = construct_envs(config, get_env_class(config.ENV_NAME))
  File "/usr/local/lib/python3.6/site-packages/habitat_baselines/utils/env_utils.py", line 102, in construct_envs
    workers_ignore_signals=workers_ignore_signals,
  File "/usr/local/lib/python3.6/site-packages/habitat/core/vector_env.py", line 153, in __init__
    read_fn() for read_fn in self._connection_read_fns
  File "/usr/local/lib/python3.6/site-packages/habitat/core/vector_env.py", line 153, in <listcomp>
    read_fn() for read_fn in self._connection_read_fns
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 250, in recv
    buf = self._recv_bytes()
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
    buf = self._recv(4)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
    chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
Exception ignored in: <bound method VectorEnv.__del__ of <habitat.core.vector_env.VectorEnv object at 0x7f5bb975f898>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/site-packages/habitat/core/vector_env.py", line 553, in __del__
    self.close()
  File "/usr/local/lib/python3.6/site-packages/habitat/core/vector_env.py", line 419, in close
    write_fn((CLOSE_COMMAND, None))
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 404, in _send_bytes
    self._send(header + buf)
  File "/usr/local/lib/python3.6/multiprocessing/connection.py", line 368, in _send
    n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe

KenaHemnani · Dec 30 '20

This seems to be an actual issue in the habitat communication procedure, which is unfortunate. I have found other references to this here and here on GitHub, but it seems it was either not resolved or resolved by checking out different branches, so unfortunately I am not knowledgeable enough for this one. I don't know which version of habitat you are using, but something that has worked for me so far (after rebooting my computer and/or restarting the anaconda env - don't laugh ...) is to install the latest stable version of habitat-sim (the conda version works for me) and habitat-api according to the installation guide (I don't use docker). I hope this helps, although I can't really tell you the cause in this case. Edit: I just remembered that there is a test script that runs a training process, python examples/benchmark.py - was this successful for you? I think this would also use the communication structure listed in your error (although the devs have mentioned that the issue is usually something before, and not the communication itself; but from your error message I don't see anything else).

ronjamoller · Dec 30 '20

newer checkpoint files contain a config entry, older checkpoint files do not.

The configs have been removed from released models to not tie those weights to a specific version of the config. We'd have to update those configs for even minor things like adding new fields or changing existing field names.

as you have pointed out, you can also set self.config.EVAL.USE_CKPT_CONFIG to False but I don't think the default .yaml files contain that parameter so you would have to set it manually in run.py which is not really straightforward. It is set to

You can add that to any of the .yaml files as

EVAL:
    USE_CKPT_CONFIG: False

@KenaHemnani Seeing communication fail on init likely means the subprocesses are getting killed for some reason. My guess would be out-of-memory.

erikwijmans · Jan 03 '21

Thank you @erikwijmans! Your solution is much more elegant, thank you. Thank you as well for clearing up the config issue - I had (wrongly) assumed that it was introduced when the models got more and more complex, to keep track of the changes; I hadn't thought of the maintenance involved. As for the memory issue - this might happen with a big dataset such as Matterport or Gibson, I guess? The habitat test scenes dataset should not be too big.

ronjamoller · Jan 04 '21

As for the memory issue - this might happen with a big dataset such as Matterport or Gibson, I guess? The habitat test scenes dataset should not be too big.

True. @KenaHemnani can you type dmesg immediately after the failure and paste part of that log here? That may give an idea of what is going on.
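(If the dmesg output is long, the interesting part is usually any OOM-killer message near the end; filtering it with something along the lines of dmesg | grep -i -E "out of memory|killed process" should surface it.)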

erikwijmans · Jan 04 '21

Hello @erikwijmans , I am getting following :(after typing dmesg) `[ 0.000000] Linux version 4.19.112+ (builder@a12462ca91c8) (Chromium OS 10.0_pre377782_p20200113-r10 clang version 10.0.0 (/var/cache/chromeos-cache/distfiles/host/egit-src/llvm-project 4e8231b5cf0f5f62c7a51a857e29f5be5cb55734)) #1 SMP Thu Jul 23 08:00:38 PDT 2020 [ 0.000000] Command line: BOOT_IMAGE=/syslinux/vmlinuz.A init=/usr/lib/systemd/systemd boot=local rootwait ro noresume noswap loglevel=7 noinitrd console=ttyS0 security=apparmor virtio_net.napi_tx=1 systemd.unified_cgroup_hierarchy=false systemd.legacy_systemd_cgroup_controller=false csm.disabled=1 dm_verity.error_behavior=3 dm_verity.max_bios=-1 dm_verity.dev_wait=1 i915.modeset=1 cros_efi loadpin.enabled=0 module.sig_enforce=0 root=/dev/dm-0 "dm=1 vroot none ro 1,0 4077568 verity payload=PARTUUID=555BDB75-CBD7-CD4A-B24E-29B13D7AC0DF hashtree=PARTUUID=555BDB75-CBD7-CD4A-B24E-29B13D7AC0DF hashstart=4077568 alg=sha256 root_hexdigest=42104d547ac104fb7061529e78f53e4f3e8c3d3cbb040dc6e0f84aad68491347 salt=9dc7f3acc4e2ce65be16356e960c2b21b51a917fa31d2e891fd295490c991e41" mitigations=off [ 0.000000] x86/fpu: Supporting XSAVE feature 0x001: 'x87 floating point registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x002: 'SSE registers' [ 0.000000] x86/fpu: Supporting XSAVE feature 0x004: 'AVX registers' [ 0.000000] x86/fpu: xstate_offset[2]: 576, xstate_sizes[2]: 256 [ 0.000000] x86/fpu: Enabled xstate features 0x7, context size is 832 bytes, using 'standard' format. [ 0.000000] BIOS-provided physical RAM map: [ 0.000000] BIOS-e820: [mem 0x0000000000000000-0x0000000000000fff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000001000-0x0000000000054fff] usable [ 0.000000] BIOS-e820: [mem 0x0000000000055000-0x000000000005ffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000060000-0x0000000000097fff] usable [ 0.000000] BIOS-e820: [mem 0x0000000000098000-0x000000000009ffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000000100000-0x00000000bd956fff] usable [ 0.000000] BIOS-e820: [mem 0x00000000bd957000-0x00000000bd959fff] ACPI data [ 0.000000] BIOS-e820: [mem 0x00000000bd95a000-0x00000000bd95afff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000bd95b000-0x00000000bdadcfff] usable [ 0.000000] BIOS-e820: [mem 0x00000000bdadd000-0x00000000bdae4fff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000bdae5000-0x00000000bdb1afff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000bdb1b000-0x00000000beb9afff] usable [ 0.000000] BIOS-e820: [mem 0x00000000beb9b000-0x00000000bebf2fff] reserved [ 0.000000] BIOS-e820: [mem 0x00000000bebf3000-0x00000000bebfafff] ACPI data [ 0.000000] BIOS-e820: [mem 0x00000000bebfb000-0x00000000bebfefff] ACPI NVS [ 0.000000] BIOS-e820: [mem 0x00000000bebff000-0x00000000bffdffff] usable [ 0.000000] BIOS-e820: [mem 0x00000000bffe0000-0x00000000bfffffff] reserved [ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000037fffffff] usable [ 0.000000] NX (Execute Disable) protection: active [ 0.000000] efi: EFI v2.70 by EDK II [ 0.000000] efi: ACPI=0xbebfa000 ACPI 2.0=0xbebfa014 SMBIOS=0xbebcd000 MEMATTR=0xbdea0198 TPMEventLog=0xbd2af018 [ 0.000000] SMBIOS 2.4 present. 
[ 0.000000] DMI: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011 [ 0.000000] Hypervisor detected: KVM [ 0.000000] kvm-clock: Using msrs 4b564d01 and 4b564d00 [ 0.000000] kvm-clock: cpu 0, msr 141600001, primary cpu clock [ 0.000000] kvm-clock: using sched offset of 6181546513 cycles [ 0.000002] clocksource: kvm-clock: mask: 0xffffffffffffffff max_cycles: 0x1cd42e4dffb, max_idle_ns: 881590591483 ns [ 0.000003] tsc: Initial usec timer 3992678 [ 0.000004] tsc: Detected 2200.000 MHz processor [ 0.000080] e820: update [mem 0x00000000-0x00000fff] usable ==> reserved [ 0.000082] e820: remove [mem 0x000a0000-0x000fffff] usable [ 0.000086] last_pfn = 0x380000 max_arch_pfn = 0x400000000 [ 0.000116] MTRR default type: write-back [ 0.000116] MTRR fixed ranges enabled: [ 0.000117] 00000-9FFFF write-back [ 0.000118] A0000-FFFFF uncachable [ 0.000118] MTRR variable ranges enabled: [ 0.000119] 0 base 0000C0000000 mask 3FFFC0000000 uncachable [ 0.000120] 1 disabled [ 0.000120] 2 disabled [ 0.000120] 3 disabled [ 0.000121] 4 disabled [ 0.000121] 5 disabled [ 0.000121] 6 disabled [ 0.000122] 7 disabled [ 0.000131] x86/PAT: Configuration [0-7]: WB WC UC- UC WB WP UC- WT
[ 0.000137] last_pfn = 0xbffe0 max_arch_pfn = 0x400000000 [ 0.013012] Scanning 1 areas for low memory corruption [ 0.013038] Kernel/User page tables isolation: disabled on command line. [ 0.013043] Using GB pages for direct mapping [ 0.013045] BRK [0x141801000, 0x141801fff] PGTABLE [ 0.013048] BRK [0x141802000, 0x141802fff] PGTABLE [ 0.013049] BRK [0x141803000, 0x141803fff] PGTABLE [ 0.013064] BRK [0x141804000, 0x141804fff] PGTABLE [ 0.013104] BRK [0x141805000, 0x141805fff] PGTABLE [ 0.013120] BRK [0x141806000, 0x141806fff] PGTABLE [ 0.013130] BRK [0x141807000, 0x141807fff] PGTABLE [ 0.013152] BRK [0x141808000, 0x141808fff] PGTABLE [ 0.013172] BRK [0x141809000, 0x141809fff] PGTABLE [ 0.013186] Secure boot disabled [ 0.013195] ACPI: Early table checksum verification disabled [ 0.013198] ACPI: RSDP 0x00000000BEBFA014 000024 (v02 Google) [ 0.013201] ACPI: XSDT 0x00000000BEBF90E8 00005C (v01 Google GOOGFACP 00000001 01000013) [ 0.013206] ACPI: FACP 0x00000000BEBF4000 0000F4 (v02 Google GOOGFACP 00000001 GOOG 00000001) [ 0.013210] ACPI: DSDT 0x00000000BEBF5000 0018BA (v01 Google GOOGDSDT 00000001 GOOG 00000001) [ 0.013213] ACPI: FACS 0x00000000BD95A000 000040 [ 0.013216] ACPI: SSDT 0x00000000BEBF8000 000316 (v02 GOOGLE Tpm2Tabl 00001000 INTL 20160527) [ 0.013219] ACPI: TPM2 0x00000000BEBF7000 000034 (v04 GOOGLE 00000001 GOOG 00000001) [ 0.013222] ACPI: SRAT 0x00000000BEBF3000 0000C8 (v03 Google GOOGSRAT 00000001 GOOG 00000001) [ 0.013225] ACPI: APIC 0x00000000BD959000 000076 (v05 Google GOOGAPIC 00000001 GOOG 00000001) [ 0.013228] ACPI: SSDT 0x00000000BD958000 000980 (v01 Google GOOGSSDT 00000001 GOOG 00000001) [ 0.013231] ACPI: WAET 0x00000000BD957000 000028 (v01 Google GOOGWAET 00000001 GOOG 00000001) [ 0.013237] ACPI: Local APIC address 0xfee00000 [ 0.013254] SRAT: PXM 0 -> APIC 0x00 -> Node 0 [ 0.013256] SRAT: PXM 0 -> APIC 0x01 -> Node 0 [ 0.013260] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff] [ 0.013262] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0xbfffffff] [ 0.013264] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x37fffffff] [ 0.013268] NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0xbfffffff] -> [mem 0x00000000-0xbfffffff] [ 0.013271] NUMA: Node 0 [mem 0x00000000-0xbfffffff] + [mem 0x100000000-0x37fffffff] -> [mem 0x00000000-0x37fffffff] [ 0.013276] NODE_DATA(0) allocated [mem 0x37fffc000-0x37fffffff] [ 0.013307] Zone ranges: [ 0.013308] DMA [mem 0x0000000000001000-0x0000000000ffffff] [ 0.013309] DMA32 [mem 0x0000000001000000-0x00000000ffffffff] [ 0.013310] Normal [mem 0x0000000100000000-0x000000037fffffff] [ 0.013311] Movable zone start for each node [ 0.013312] Early memory node ranges [ 0.013313] node 0: [mem 0x0000000000001000-0x0000000000054fff] [ 0.013314] node 0: [mem 0x0000000000060000-0x0000000000097fff] [ 0.013315] node 0: [mem 0x0000000000100000-0x00000000bd956fff] [ 0.013315] node 0: [mem 0x00000000bd95b000-0x00000000bdadcfff] [ 0.013316] node 0: [mem 0x00000000bdb1b000-0x00000000beb9afff] [ 0.013317] node 0: [mem 0x00000000bebff000-0x00000000bffdffff] [ 0.013318] node 0: [mem 0x0000000100000000-0x000000037fffffff] [ 0.013326] Zeroed struct page in unavailable ranges: 314 pages [ 0.013327] Initmem setup node 0 [mem 0x0000000000001000-0x000000037fffffff] [ 0.013329] On node 0 totalpages: 3407558 [ 0.013330] DMA zone: 64 pages used for memmap [ 0.013330] DMA zone: 3119 pages reserved [ 0.013331] DMA zone: 3980 pages, LIFO batch:0 [ 0.013373] DMA32 zone: 12221 pages used for memmap [ 0.013374] DMA32 zone: 782138 pages, LIFO batch:63 [ 0.022285] Normal zone: 
[... Google Colab VM kernel boot log trimmed; the relevant tail follows ...] [ 2569.806197] python[19701]: segfault at 10 ip 00007f7268976bee sp 00007ffdc109be50 error 4 in libCorradeUtility.so.2[7f726893a000+64000] [ 2569.818619] Code: e8 97 fe ff ff 48 39 43 20 74 09 48 8b 40 08 5b c3 0f 1f 00 31 c0 5b c3 0f 1f 40 00 41 57 41 56 41 55 41 54 55 53 48 83 ec 08 <48> 8b 1f 4c 8b 77 08 4c 39 f3 74 52 4c 8b 2e 41 89 d7 45 31 e4 49 [ 2569.838660] python[19700]: segfault at 10 ip 00007f7268976bee sp 00007ffdc109be50 error 4 in libCorradeUtility.so.2[7f726893a000+64000] [ 2569.904684] python[19699]: segfault at 10 ip 00007f7268976bee sp 00007ffdc109be50 error 4 in libCorradeUtility.so.2[7f726893a000+64000] [ 2570.040635] python[19702]: segfault at 10 ip 00007f7268976bee sp 00007ffdc109be50 error 4 in libCorradeUtility.so.2[7f726893a000+64000]`

KenaHemnani avatar Jan 05 '21 17:01 KenaHemnani

I am running this on Google Colab.

KenaHemnani avatar Jan 05 '21 17:01 KenaHemnani

I am not sure what is going on. Does the RL training part of this colab work: https://colab.research.google.com/github/facebookresearch/habitat-lab/blob/master/examples/tutorials/colabs/Habitat_Lab.ipynb ? Also make sure you are requesting a GPU instance.
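
(One quick way to confirm the runtime actually has a GPU — a minimal check, assuming PyTorch is already installed in the Colab environment:)

```python
import torch

# Should print True on a GPU runtime; if it prints False,
# switch the Colab runtime type to GPU and restart.
print(torch.cuda.is_available())
if torch.cuda.is_available():
    print(torch.cuda.get_device_name(0))
```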

erikwijmans avatar Jan 05 '21 18:01 erikwijmans

Yes, @erikwijmans, this works fine. And yes, the GPU runtime is selected.

KenaHemnani avatar Jan 05 '21 19:01 KenaHemnani

Hmm, odd. Maybe there is something with `!python` in colab then. `run.py` is pretty simple and you even grab `run_exp` from it, so maybe things will work if you just call into that function directly?

erikwijmans avatar Jan 05 '21 19:01 erikwijmans

> Hmm, odd. Maybe there is something with `!python` in colab then. `run.py` is pretty simple and you even grab `run_exp` from it, so maybe things will work if you just call into that function directly?

@erikwijmans, sorry, I didn't understand what you are asking.

KenaHemnani avatar Jan 05 '21 19:01 KenaHemnani

It looks like you are calling into habitat-baselines via

```
!python -u habitat_baselines/run.py --exp-config habitat_baselines/config/pointnav/ppo_pointnav_example.yaml --run-type train
```

The example colab doesn't do that; it invokes the trainers directly, and that seems to work. You can also invoke what `run.py` does directly via the `run_exp` function (https://github.com/facebookresearch/habitat-lab/blob/master/habitat_baselines/run.py#L63). Can you try doing that and see if that fixes it?
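
(Something like the sketch below — the `run_exp` signature is assumed from the linked `run.py` and may differ between habitat-lab versions:)

```python
from habitat_baselines.run import run_exp

# In-process equivalent of the `!python habitat_baselines/run.py ...` call above.
run_exp(
    exp_config="habitat_baselines/config/pointnav/ppo_pointnav_example.yaml",
    run_type="train",
    opts=[],
)
```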

erikwijmans avatar Jan 05 '21 19:01 erikwijmans

@erikwijmans yes, I was doing that. Under the RL Training section of the Habitat-lab colab I tried changing the config to `config = get_baselines_config("./habitat_baselines/config/pointnav/ppo_pointnav.yaml")`, but it gives the following error:

```
AssertionError                            Traceback (most recent call last)
in ()
      2 trainer_init = baseline_registry.get_trainer(config.TRAINER_NAME)
      3 trainer = trainer_init(config)
----> 4 trainer.train()

3 frames

/content/habitat-lab/habitat/datasets/pointnav/pointnav_dataset.py in get_scenes_to_load(cls, config)
     41         episodes.
     42         """
---> 43         assert cls.check_config_paths_exist(config)
     44         dataset_dir = os.path.dirname(
     45             config.DATA_PATH.format(split=config.SPLIT)

AssertionError:
```
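
(For reference: that assertion comes from `check_config_paths_exist`, which verifies that the dataset file and scenes directory referenced by the config exist on disk. A minimal sketch for checking which path is missing — the config attribute names are assumed from the standard pointnav config, and `get_baselines_config` is assumed to be the alias used in the colab:)

```python
import os

from habitat_baselines.config.default import get_config as get_baselines_config

config = get_baselines_config(
    "./habitat_baselines/config/pointnav/ppo_pointnav.yaml"
)
dataset_cfg = config.TASK_CONFIG.DATASET

# These are the two paths the failing assertion checks.
data_path = dataset_cfg.DATA_PATH.format(split=dataset_cfg.SPLIT)
print(data_path, "->", os.path.exists(data_path))
print(dataset_cfg.SCENES_DIR, "->", os.path.exists(dataset_cfg.SCENES_DIR))
```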

KenaHemnani avatar Jan 06 '21 08:01 KenaHemnani