deep-high-resolution-net.pytorch

Segmentation fault (core dumped) after main() returned in tools/test.py

Open WaiTsun-Yeung opened this issue 4 years ago • 3 comments

I am currently running the following sample script on an Ubuntu 18.04.4 LTS machine with an AMD Ryzen 3920x (24 physical cores) and 3 GPUs, all detected by PyTorch:

python3 tools/test.py --cfg experiments/mpii/hrnet/w32_256x256_adam_lr1e-3.yaml TEST.MODEL_FILE models/pytorch/pose_mpii/pose_hrnet_w32_256x256.pth

I have adjusted the configuration file to test with a batch size of 8 per GPU so it consumes less memory, and here is the output:

=> creating output/mpii/pose_hrnet/w32_256x256_adam_lr1e-3
=> creating log/mpii/pose_hrnet/w32_256x256_adam_lr1e-3_2021-08-24-18-48
Namespace(cfg='experiments/mpii/hrnet/w32_256x256_adam_lr1e-3.yaml', dataDir='', logDir='', modelDir='', opts=['TEST.MODEL_FILE', 'models/pytorch/pose_mpii/pose_hrnet_w32_256x256.pth'], prevModelDir='')
AUTO_RESUME: True
CUDNN:
  BENCHMARK: True
  DETERMINISTIC: False
  ENABLED: True
DATASET:
  COLOR_RGB: True
  DATASET: mpii
  DATA_FORMAT: jpg
  FLIP: True
  HYBRID_JOINTS_TYPE: 
  NUM_JOINTS_HALF_BODY: 8
  PROB_HALF_BODY: -1.0
  ROOT: data/mpii/
  ROT_FACTOR: 30
  SCALE_FACTOR: 0.25
  SELECT_DATA: False
  TEST_SET: valid
  TRAIN_SET: train
DATA_DIR: 
DEBUG:
  DEBUG: True
  SAVE_BATCH_IMAGES_GT: True
  SAVE_BATCH_IMAGES_PRED: True
  SAVE_HEATMAPS_GT: True
  SAVE_HEATMAPS_PRED: True
GPUS: (0, 1, 2)
LOG_DIR: log
LOSS:
  TOPK: 8
  USE_DIFFERENT_JOINTS_WEIGHT: False
  USE_OHKM: False
  USE_TARGET_WEIGHT: True
MODEL:
  EXTRA:
    FINAL_CONV_KERNEL: 1
    PRETRAINED_LAYERS: ['conv1', 'bn1', 'conv2', 'bn2', 'layer1', 'transition1', 'stage2', 'transition2', 'stage3', 'transition3', 'stage4']
    STAGE2:
      BLOCK: BASIC
      FUSE_METHOD: SUM
      NUM_BLOCKS: [4, 4]
      NUM_BRANCHES: 2
      NUM_CHANNELS: [32, 64]
      NUM_MODULES: 1
    STAGE3:
      BLOCK: BASIC
      FUSE_METHOD: SUM
      NUM_BLOCKS: [4, 4, 4]
      NUM_BRANCHES: 3
      NUM_CHANNELS: [32, 64, 128]
      NUM_MODULES: 4
    STAGE4:
      BLOCK: BASIC
      FUSE_METHOD: SUM
      NUM_BLOCKS: [4, 4, 4, 4]
      NUM_BRANCHES: 4
      NUM_CHANNELS: [32, 64, 128, 256]
      NUM_MODULES: 3
  HEATMAP_SIZE: [64, 64]
  IMAGE_SIZE: [256, 256]
  INIT_WEIGHTS: True
  NAME: pose_hrnet
  NUM_JOINTS: 16
  PRETRAINED: models/pytorch/imagenet/hrnet_w32-36af842e.pth
  SIGMA: 2
  TAG_PER_JOINT: True
  TARGET_TYPE: gaussian
OUTPUT_DIR: output
PIN_MEMORY: True
PRINT_FREQ: 100
RANK: 0
TEST:
  BATCH_SIZE_PER_GPU: 8
  BBOX_THRE: 1.0
  COCO_BBOX_FILE: 
  FLIP_TEST: True
  IMAGE_THRE: 0.1
  IN_VIS_THRE: 0.0
  MODEL_FILE: models/pytorch/pose_mpii/pose_hrnet_w32_256x256.pth
  NMS_THRE: 0.6
  OKS_THRE: 0.5
  POST_PROCESS: True
  SHIFT_HEATMAP: True
  SOFT_NMS: False
  USE_GT_BBOX: False
TRAIN:
  BATCH_SIZE_PER_GPU: 8
  BEGIN_EPOCH: 0
  CHECKPOINT: 
  END_EPOCH: 210
  GAMMA1: 0.99
  GAMMA2: 0.0
  LR: 0.001
  LR_FACTOR: 0.1
  LR_STEP: [170, 200]
  MOMENTUM: 0.9
  NESTEROV: False
  OPTIMIZER: adam
  RESUME: False
  SHUFFLE: True
  WD: 0.0001
WORKERS: 24
=> loading model from models/pytorch/pose_mpii/pose_hrnet_w32_256x256.pth
/home/aaron/anaconda3/envs/pytorch_env/lib/python3.8/site-packages/json_tricks/nonp.py:221: JsonTricksDeprecation: `json_tricks.load(s)` stripped some comments, but `ignore_comments` was not passed; in the next major release, the behaviour when `ignore_comments` is not passed will change; it is recommended to explicitly pass `ignore_comments=True` if you want to strip comments; see https://github.com/mverleg/pyjson_tricks/issues/74
  warnings.warn('`json_tricks.load(s)` stripped some comments, but `ignore_comments` was '
=> load 2958 samples
Test: [0/124]	Time 5.269 (5.269)	Loss 0.0003 (0.0003)	Accuracy 0.959 (0.959)
Test: [100/124]	Time 0.362 (0.451)	Loss 0.0004 (0.0004)	Accuracy 0.880 (0.914)
| Arch | Head | Shoulder | Elbow | Wrist | Hip | Knee | Ankle | Mean | Mean@0.1 |
|---|---|---|---|---|---|---|---|---|---|
| pose_hrnet | 97.101 | 95.941 | 90.336 | 86.449 | 89.095 | 87.084 | 83.278 | 90.330 | 37.702 |
Segmentation fault (core dumped)

WaiTsun-Yeung • Aug 24 '21 11:08

As the updated title indicates, the "Segmentation fault (core dumped)" message is printed after the validate() function has finished running, for both the COCO and MPII datasets.

WaiTsun-Yeung • Aug 25 '21 01:08
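
A crash that appears only after validate() has finished can be hard to place, because by then the evaluation table has already been printed. One way to narrow it down, offered here as a diagnostic sketch and not as a fix, is Python's standard faulthandler module, which dumps the Python-level traceback of every thread when the process receives a fatal signal such as SIGSEGV. The helper name enable_crash_tracebacks is hypothetical, and calling it at the top of main() in tools/test.py is an assumption about where it would be useful.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=None):
    """Dump Python tracebacks for all threads if the process hits a fatal signal.

    Hypothetical helper for diagnosis only; it does not prevent the crash.
    `stream` must be a real file with a file descriptor and must stay open
    for as long as the handler is enabled.
    """
    if stream is None:
        stream = sys.stderr
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()


# Assumed usage: call once at the very start of main() in tools/test.py
#     enable_crash_tracebacks()
```

The same handler can be switched on without editing any code by running the script with `python3 -X faulthandler` or by setting the `PYTHONFAULTHANDLER=1` environment variable. If the fault occurs while the interpreter is shutting down, the traceback may be short or empty, which would itself suggest that the crash happens during cleanup of native resources and not inside the test code.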

Did you solve this problem?

shshin1210 • Mar 05 '22 09:03

@WaiTsun-Yeung can you please share the versions of the dependencies you installed to run the test script? Thanks.

anas-zafar • Dec 27 '22 19:12
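
For anyone answering the request above, the installed versions can be gathered in one step. The following is a minimal sketch using only the standard library (Python 3.8+). The function name collect_versions and the default package list are assumptions chosen for illustration, so adjust the list to match the project's requirements.txt.

```python
import platform
from importlib import metadata


def collect_versions(packages=("torch", "torchvision", "numpy", "opencv-python", "json-tricks")):
    """Return a dict of interpreter, platform, and installed package versions.

    The default package names are an illustrative guess, not the project's
    authoritative dependency list.
    """
    report = {
        "python": platform.python_version(),
        "platform": platform.platform(),
    }
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "not installed"
    return report


if __name__ == "__main__":
    for key, value in collect_versions().items():
        print(f"{key}: {value}")
```

PyTorch also ships its own report, `python -m torch.utils.collect_env`, which additionally lists the CUDA, cuDNN, and driver versions that are often relevant to crashes in native code.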