Long-Range-Grouping-Transformer

How can we create a demo file to test trained models and visualize results?

Chu-Orion opened this issue 1 year ago · 3 comments

Greetings! I'm sorry to bother you again. Now that the model has been trained, I would like to ask how to test it and generate a 3D voxel model from my own images. I would be grateful if a model visualization script could be collected into a test file.

Chu-Orion avatar Jan 10 '24 02:01 Chu-Orion

You can add the --test flag in train.sh for testing. For visualization, please refer to Pix2Vox.

LiyingCV avatar Jan 11 '24 07:01 LiyingCV
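(As a sketch of what that could look like — the flag wiring here is hypothetical and depends on how your train.sh actually invokes runner.py; the `--weights` name in particular is an assumption:)

```sh
# Hypothetical test invocation: reuse the training launcher, pass --test, and
# point the run at a trained checkpoint. Flag names may differ in your script.
python -m torch.distributed.launch --nproc_per_node=1 runner.py \
    --test \
    --weights ./pths/UMIFormer.pth
```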

Thank you very much for your reply. Following your instructions, I referred to Pix2Vox and wrote the visualization script shown below:

```python
import os
from collections import OrderedDict

import cv2
import numpy as np
import torch

import utils.binvox_visualization
import utils.data_transforms
# Encoder, Decoder, and Merger come from this repo's models package
# (the exact module paths depend on the config).

encoder = Encoder(cfg)
decoder = Decoder(cfg)
merger = Merger(cfg)

# Load the trained checkpoint on the CPU.
cfg.CONST.WEIGHTS = './pths/UMIFormer.pth'
checkpoint = torch.load(cfg.CONST.WEIGHTS, map_location=torch.device('cpu'))

# The checkpoint was saved from (Distributed)DataParallel-wrapped modules, so
# every key carries a 'module.' prefix that must be stripped before the state
# dicts can be loaded into the plain, unwrapped modules.
def strip_module_prefix(state_dict):
    return OrderedDict((k.replace('module.', '', 1), v) for k, v in state_dict.items())

fix_checkpoint = {
    'encoder_state_dict': strip_module_prefix(checkpoint['encoder_state_dict']),
    'decoder_state_dict': strip_module_prefix(checkpoint['decoder_state_dict']),
    'merger_state_dict': strip_module_prefix(checkpoint['merger_state_dict']),
}

epoch_idx = checkpoint['epoch_idx']
encoder.load_state_dict(fix_checkpoint['encoder_state_dict'])
decoder.load_state_dict(fix_checkpoint['decoder_state_dict'])
merger.load_state_dict(fix_checkpoint['merger_state_dict'])

# Switch to inference mode. (Pix2Vox's refiner is not used here, so those
# branches from the original Pix2Vox script are omitted.)
encoder.eval()
decoder.eval()
merger.eval()

# Read a single test image and scale it to [0, 1].
# img1_path = './datasets/ShapeNet/ShapeNetRendering/03211117/1a92363c2a155ed3c397356311cbeea4/rendering/17.png'
img1_path = './datasets/Pix3D/img/chair/0016.png'
img1_np = cv2.imread(img1_path, cv2.IMREAD_UNCHANGED).astype(np.float32) / 255.

sample = np.array([img1_np])

IMG_SIZE = cfg.CONST.IMG_H, cfg.CONST.IMG_W
CROP_SIZE = cfg.CONST.CROP_IMG_H, cfg.CONST.CROP_IMG_W

# Same preprocessing as the test pipeline: crop, background fill, normalize,
# convert to tensor.
test_transforms = utils.data_transforms.Compose([
    utils.data_transforms.CenterCrop(IMG_SIZE, CROP_SIZE),
    utils.data_transforms.RandomBackground(cfg.TEST.RANDOM_BG_COLOR_RANGE),
    utils.data_transforms.Normalize(mean=cfg.DATASET.MEAN, std=cfg.DATASET.STD),
    utils.data_transforms.ToTensor(),
])

rendering_images = test_transforms(rendering_images=sample)
rendering_images = rendering_images.unsqueeze(0)  # add a batch dimension

with torch.no_grad():
    image_features = encoder(rendering_images)
    raw_features, generated_volume = decoder(image_features)
    generated_volume = merger(raw_features, generated_volume)

generated_volume = generated_volume.squeeze(0)

# Render the predicted occupancy grid to an image.
img_dir = './output/TestResult'
gv = generated_volume.cpu().numpy()
gv_new = np.swapaxes(gv, 2, 1)
rendering_views = utils.binvox_visualization.get_volume_views(gv_new, img_dir, epoch_idx)
```
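(For anyone without `utils.binvox_visualization` at hand, a minimal stand-in using matplotlib — a sketch, not the repository's implementation; `save_volume_view` and the 0.5 threshold are my own choices — could look like this:)

```python
import os

import matplotlib
matplotlib.use('Agg')  # render off-screen, no display needed
import matplotlib.pyplot as plt


def save_volume_view(volume, save_dir, tag, threshold=0.5):
    """Binarize a predicted occupancy grid and save one 3D view as a PNG."""
    os.makedirs(save_dir, exist_ok=True)
    fig = plt.figure()
    ax = fig.add_subplot(projection='3d')
    ax.voxels(volume >= threshold, edgecolor='k', linewidth=0.1)
    save_path = os.path.join(save_dir, 'voxels-%s.png' % tag)
    fig.savefig(save_path, bbox_inches='tight')
    plt.close(fig)
    return save_path
```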

I am calling this script from runner.py:

```python
# Start the train/test process.
if not args.test and not args.batch_test:
    train_net(cfg)
else:
    if 'WEIGHTS' in cfg.CONST and os.path.exists(cfg.CONST.WEIGHTS):
        if args.test:
            test_net(test_single)
        elif args.batch_test:
            batch_test(cfg)
    else:
        logging.error('Please specify the file path of checkpoint.')
        sys.exit(2)
```

However, something still goes wrong when I try to generate the voxel model. Here is the traceback:

```
warnings.warn(
Traceback (most recent call last):
  File "runner.py", line 20, in <module>
    from core.test_single_image import test_single
  File "/root/UMIFormer/core/test_single_image.py", line 16, in <module>
    class test_single():
  File "/root/UMIFormer/core/test_single_image.py", line 18, in test_single
    encoder = Encoder(cfg)
  File "/root/UMIFormer/models/encoder/encoder_vit_ivdb.py", line 21, in __init__
    self.encoder = self.create_model(
  File "/root/UMIFormer/models/encoder/encoder_vit_ivdb.py", line 57, in create_model
    return self._create_vision_transformer(
  File "/root/UMIFormer/models/encoder/encoder_vit_ivdb.py", line 43, in _create_vision_transformer
    model = self.decoupling_type(
  File "/root/UMIFormer/models/encoder/encoder_vit_ivdb.py", line 200, in __init__
    if torch.distributed.get_rank() == 0:
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 822, in get_rank
    default_pg = _get_default_group()
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/distributed_c10d.py", line 410, in _get_default_group
    raise RuntimeError(
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 13409) of binary: /root/miniconda3/bin/python
Traceback (most recent call last):
  File "/root/miniconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/root/miniconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
    main()
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
    launch(args)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
    run(args)
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/run.py", line 710, in run
    elastic_launch(
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
    return launch_agent(self._config, self._entrypoint, list(args))
  File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
    raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
runner.py FAILED
------------------------------------------------------------
Failures:
  <NO_OTHER_FAILURES>
------------------------------------------------------------
Root Cause (first observed failure):
[0]:
  time      : 2024-01-13_12:59:56
  host      : autodl-container-d82911a03c-d427ecf0
  rank      : 0 (local_rank: 0)
  exitcode  : 1 (pid: 13409)
  error_file: <N/A>
```

So, what should I do to fix this? Thank you very much! (I'm really sorry for my weak coding skills (༎ຶ ෴ ༎ຶ))

Chu-Orion avatar Jan 13 '24 05:01 Chu-Orion


It seems like you made some mistakes in the initialization of the Encoder. Please check your code.

LiyingCV avatar Jan 15 '24 02:01 LiyingCV
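
For readers hitting the same RuntimeError: the traceback shows that `encoder_vit_ivdb.py` calls `torch.distributed.get_rank()` while the Encoder is being constructed, and that this construction runs at import time because it sits in the body of `class test_single` — all before any process group exists. A minimal sketch of one possible workaround (my own suggestion, not the repository's official fix) is to guard the rank query so the encoder can also be built in a plain single-process script:

```python
import torch.distributed as dist

# Defensive rank check (assumed placement: around the get_rank() call in
# models/encoder/encoder_vit_ivdb.py). It falls back to "main process" when
# no process group has been initialized, e.g. outside torch.distributed.launch.
def is_main_process():
    if not dist.is_available() or not dist.is_initialized():
        return True
    return dist.get_rank() == 0

if is_main_process():
    ...  # rank-0-only work, e.g. logging or downloading pretrained weights
```

Alternatively, keep launching with `torch.distributed.launch --nproc_per_node=1` but call `torch.distributed.init_process_group(backend='nccl', init_method='env://')` before constructing the Encoder (the launcher sets the required RANK/WORLD_SIZE environment variables), and consider moving the model construction out of the class body into a function so that importing `core.test_single_image` has no side effects.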