human-pose-estimation.pytorch

A tensorboardX error occurs when I run train.py

Open do-oo opened this issue 6 years ago • 3 comments

Here is the error log:

/home/boyun/anaconda3/envs/tc0.40/bin/python /home/boyun/PycharmProjects/Humanpose/MultiPose/MSRA_BASELINE/human-pose-estimation.pytorch/pose_estimation/train.py
=> creating output/coco/pose_resnet_101/384x288_d256x3_adam_lr1e-3
Namespace(cfg='experiments/coco/resnet101/384x288_d256x3_adam_lr1e-3.yaml', frequent=5, gpus=None, workers=None)
=> creating log/coco/pose_resnet_101/384x288_d256x3_adam_lr1e-3_2019-01-24-14-54
{'CUDNN': {'BENCHMARK': True, 'DETERMINISTIC': False, 'ENABLED': True},
 'DATASET': {'DATASET': 'coco', 'DATA_FORMAT': 'jpg', 'FLIP': True, 'HYBRID_JOINTS_TYPE': '', 'ROOT': 'data/coco/', 'ROT_FACTOR': 40, 'SCALE_FACTOR': 0.3, 'SELECT_DATA': False, 'TEST_SET': 'val2017', 'TRAIN_SET': 'train2017'},
 'DATA_DIR': '',
 'DEBUG': {'DEBUG': True, 'SAVE_BATCH_IMAGES_GT': True, 'SAVE_BATCH_IMAGES_PRED': True, 'SAVE_HEATMAPS_GT': True, 'SAVE_HEATMAPS_PRED': True},
 'GPUS': '0',
 'LOG_DIR': 'log',
 'LOSS': {'USE_TARGET_WEIGHT': True},
 'MODEL': {'EXTRA': {'DECONV_WITH_BIAS': False, 'FINAL_CONV_KERNEL': 1, 'HEATMAP_SIZE': array([72, 96]), 'NUM_DECONV_FILTERS': [256, 256, 256], 'NUM_DECONV_KERNELS': [4, 4, 4], 'NUM_DECONV_LAYERS': 3, 'NUM_LAYERS': 101, 'SIGMA': 3, 'TARGET_TYPE': 'gaussian'}, 'IMAGE_SIZE': array([288, 384]), 'INIT_WEIGHTS': True, 'NAME': 'pose_resnet', 'NUM_JOINTS': 17, 'PRETRAINED': 'models/pytorch/imagenet/resnet101-5d3b4d8f.pth', 'STYLE': 'pytorch'},
 'OUTPUT_DIR': 'output',
 'PRINT_FREQ': 5,
 'TEST': {'BATCH_SIZE': 1, 'BBOX_THRE': 1.0, 'COCO_BBOX_FILE': 'data/coco/person_detection_results/COCO_val2017_detections_AP_H_56_person.json', 'FLIP_TEST': False, 'IMAGE_THRE': 0.0, 'IN_VIS_THRE': 0.2, 'MODEL_FILE': '', 'NMS_THRE': 1.0, 'OKS_THRE': 0.9, 'POST_PROCESS': True, 'SHIFT_HEATMAP': True, 'USE_GT_BBOX': True},
 'TRAIN': {'BATCH_SIZE': 6, 'BEGIN_EPOCH': 0, 'CHECKPOINT': '', 'END_EPOCH': 140, 'GAMMA1': 0.99, 'GAMMA2': 0.0, 'LR': 0.001, 'LR_FACTOR': 0.1, 'LR_STEP': [90, 120], 'MOMENTUM': 0.9, 'NESTEROV': False, 'OPTIMIZER': 'adam', 'RESUME': False, 'SHUFFLE': True, 'WD': 0.0001},
 'WORKERS': 4}
=> init deconv weights from normal distribution
=> init 0.weight as normal(0, 0.001)
=> init 0.bias as 0
=> init 1.weight as 1
=> init 1.bias as 0
=> init 3.weight as normal(0, 0.001)
=> init 3.bias as 0
=> init 4.weight as 1
=> init 4.bias as 0
=> init 6.weight as normal(0, 0.001)
=> init 6.bias as 0
=> init 7.weight as 1
=> init 7.bias as 0
=> init final conv weights from normal distribution
=> init 8.weight as normal(0, 0.001)
=> init 8.bias as 0
=> loading pretrained model models/pytorch/imagenet/resnet101-5d3b4d8f.pth
Traceback (most recent call last):
  File "/home/boyun/PycharmProjects/Humanpose/MultiPose/MSRA_BASELINE/human-pose-estimation.pytorch/pose_estimation/train.py", line 208, in <module>
    main()
  File "/home/boyun/PycharmProjects/Humanpose/MultiPose/MSRA_BASELINE/human-pose-estimation.pytorch/pose_estimation/train.py", line 114, in main
    writer_dict['writer'].add_graph(model, (dump_input, ), verbose=False)
  File "/home/boyun/anaconda3/envs/tc0.40/lib/python3.6/site-packages/tensorboardX/writer.py", line 566, in add_graph
    self.file_writer.add_graph(graph(model, input_to_model, verbose))
  File "/home/boyun/anaconda3/envs/tc0.40/lib/python3.6/site-packages/tensorboardX/pytorch_graph.py", line 171, in graph
    from torch.onnx.utils import OperatorExportTypes
ImportError: cannot import name 'OperatorExportTypes'

Process finished with exit code 1

When I remove writer_dict['writer'].add_graph(model, (dump_input, ), verbose=False), training runs fine, so I suspect a bug in tensorboardX. I have tensorboardX 1.6 installed, but this repo's requirements only ask for tensorboardX >= 1.2. How should I debug this? Any suggestions would be appreciated.
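One way to narrow this down is to test the failing import on its own, outside of train.py. The sketch below is only an illustration: can_import_name is a hypothetical helper name, not part of this repo, PyTorch, or tensorboardX. It reports whether a module exposes a given name, which is the check that the last traceback frame fails on. The environment name tc0.40 hints at an older PyTorch, but the log does not confirm the installed version.

```python
import importlib


def can_import_name(module_name, attr_name):
    """Return True if `from module_name import attr_name` would succeed.

    Hypothetical debugging helper, not part of any library.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import.
        return False
    return hasattr(module, attr_name)


if __name__ == '__main__':
    # Same import that fails in tensorboardX/pytorch_graph.py in the log above.
    print(can_import_name('torch.onnx.utils', 'OperatorExportTypes'))
```

If this prints False in the training environment, the installed PyTorch does not provide the name that this tensorboardX release tries to import, which points to a version mismatch between the two packages.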

do-oo avatar Jan 24 '19 07:01 do-oo

@Will-Hui It works well when I remove the writer_dict line, as you mentioned.

738654805 avatar Mar 25 '19 09:03 738654805

Just change the tensorboardX version; don't use the latest version.

Method one: in pose_estimation/train.py

# NOTE: requires 1.2 <= tensorboardX <= 1.5
# writer_dict['writer'].add_graph(model, (dump_input, ))
writer_dict['writer'].add_graph(model, (dump_input, ), verbose=False)

Method two: pin the tensorboardX version as follows:

tensorboardX==1.5
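The version range from the note in method one (1.2 up to 1.5) can also be checked in code before calling add_graph. This is a minimal sketch under stated assumptions: version_tuple and in_supported_range are hypothetical helper names, and the version string is assumed to be passed in by hand (for example, as reported by pip show tensorboardX) in a plain numeric major.minor form.

```python
def version_tuple(version):
    """Turn a string like '1.5' or '1.5.1' into a (major, minor) tuple."""
    return tuple(int(part) for part in version.split('.')[:2])


def in_supported_range(version, low=(1, 2), high=(1, 5)):
    """Return True if low <= (major, minor) <= high.

    The default bounds are taken from the note in this thread.
    """
    return low <= version_tuple(version) <= high


if __name__ == '__main__':
    for candidate in ('1.2', '1.5', '1.6'):
        print(candidate, in_supported_range(candidate))
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as '1.10' sorting before '1.2'.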

HuAndrew avatar Mar 25 '19 19:03 HuAndrew

@HuAndrew Thx.

738654805 avatar Mar 26 '19 01:03 738654805