reid_baseline_with_syncbn

Multi-GPU training problem?

Open frensher opened this issue 2 years ago • 0 comments

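Hello. When I run on multiple GPUs, training hangs at this point and never continues. The single-GPU script works fine. The last log line is:

2022-06-19 19:09:40,134 reid_baseline.train INFO: Trainer Built

This is the only thing I modified: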

MODEL:
  PRETRAIN_PATH: '/home/wgj233/.cache/torch/checkpoints/resnet50-19c8e357.pth'

INPUT:
  SIZE_TRAIN: [384, 384]
  SIZE_TEST: [384, 384]
  PIXEL_MEAN: [0.5, 0.5, 0.5]
  PIXEL_STD: [0.5, 0.5, 0.5]
  PROB: 0.5 # random horizontal flip
  RE_PROB: 0.5 # random erasing
  PADDING: 0

DATASETS:
  NAMES: 'FVRID_sum' # 'market1501'
  DATA_PATH: '/home/wgj233/Datasets/FVRID_sum' # '#/home/zbc/data/market1501'
  TRAIN_PATH: 'train_foggy' # 'bounding_box_train'
  QUERY_PATH: 'query_foggy' # 'query'
  GALLERY_PATH: 'gallery_foggy' # 'bounding_box_test'

DATALOADER:
  SAMPLER: 'softmax_triplet'
  NUM_INSTANCE: 8
  NUM_WORKERS: 4

SOLVER:
  OPTIMIZER_NAME: 'Adam'
  MAX_EPOCHS: 30
  BASE_LR: 0.0001
  BIAS_LR_FACTOR: 1
  WEIGHT_DECAY: 0.0005
  WEIGHT_DECAY_BIAS: 0.0005
  IMS_PER_BATCH: 16

  STEPS: [20, 30, 45, 60, 75, 90, 105, 120, 135, 150, 165, 180, 195, 210, 225, 240, 255]
  GAMMA: 0.6

  WARMUP_FACTOR: 0.01
  WARMUP_ITERS: 10
  WARMUP_METHOD: 'linear'

  CHECKPOINT_PERIOD: 1
  LOG_PERIOD: 100
  EVAL_PERIOD: 1

TEST:
  IMS_PER_BATCH: 16
  DEBUG: True
  WEIGHT: "path"
  MULTI_GPU: True

OUTPUT_DIR: "/home/wgj233/reid_baseline_with_syncbn-master/outputs/debug_multi-gpu"
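
For anyone reading the solver settings above, here is a rough sketch of the learning rate they would produce. It assumes the values feed a linear-warmup, multi-step-decay schedule of the kind common in ReID baselines, stepped once per 0-based epoch. Whether this repository steps per epoch or per iteration is not confirmed here, and `lr_at` is a hypothetical helper, not a function from the repo.

```python
from bisect import bisect_right

# Values copied from the config posted in this issue.
BASE_LR = 0.0001
STEPS = [20, 30, 45, 60, 75, 90, 105, 120, 135, 150,
         165, 180, 195, 210, 225, 240, 255]
GAMMA = 0.6
WARMUP_FACTOR = 0.01
WARMUP_ITERS = 10
MAX_EPOCHS = 30


def lr_at(epoch: int) -> float:
    """Hypothetical helper: LR at a 0-based epoch.

    Assumes linear warmup from WARMUP_FACTOR to 1 over WARMUP_ITERS epochs,
    then multiplication by GAMMA for every milestone in STEPS already reached.
    """
    warmup = 1.0
    if epoch < WARMUP_ITERS:
        alpha = epoch / WARMUP_ITERS
        warmup = WARMUP_FACTOR * (1 - alpha) + alpha
    return BASE_LR * warmup * GAMMA ** bisect_right(STEPS, epoch)


if __name__ == "__main__":
    for epoch in range(MAX_EPOCHS):
        print(f"epoch {epoch:2d}: lr = {lr_at(epoch):.2e}")
```

Under these assumptions only the milestone at epoch 20 takes effect within `MAX_EPOCHS: 30`, so the later entries in `STEPS` are never reached. This is unrelated to the hang itself and only helps when reading the config.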

frensher, Jun 19 '22 11:06