CWT-for-FSS

Difference between transformer_resnet50 and pspnet_resnet50

Open bach05 opened this issue 2 years ago • 1 comments

Hi, I am trying to reproduce the results in your paper, but I have run into some difficulties.

I downloaded the COCO dataset and set data_root in coco.yaml to its folder. I then launched a training run, which produced a transformer_resnet50 model. The test results were poor: about 0.13 mIoU with 5 shots on s0.

I downloaded the pre-trained models. The folder contains pspnet_resnet50 models. I changed resume_weights in coco.yaml to point to them. When I run the test with these models, I get the following error:

sh scripts/test.sh coco 5 [2] 50 0
==> Running DDP checkpoint example on rank 0.
=> no weight found at '/home/bacchin/CWT/CWT_venv/CWT-for-FSS/model_ckpt/coco/split=0/model/shot_5/transformer_resnet50/best.pth'
Traceback (most recent call last):
  File "/usr/lib/python3.6/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/lib/python3.6/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/bacchin/CWT/CWT_venv/CWT-for-FSS/src/test.py", line 302, in <module>
    mp.spawn(main_worker, args=(world_size, args), nprocs=world_size, join=True)
  File "/home/bacchin/CWT/CWT_venv/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 230, in spawn
    return start_processes(fn, args, nprocs, join, daemon, start_method='spawn')
  File "/home/bacchin/CWT/CWT_venv/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 188, in start_processes
    while not context.join():
  File "/home/bacchin/CWT/CWT_venv/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 150, in join
    raise ProcessRaisedException(msg, error_index, failed_process.pid)
torch.multiprocessing.spawn.ProcessRaisedException: 

-- Process 0 terminated with the following error:
Traceback (most recent call last):
  File "/home/bacchin/CWT/CWT_venv/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 59, in _wrap
    fn(i, *args)
  File "/home/bacchin/CWT/CWT_venv/CWT-for-FSS/src/test.py", line 93, in main_worker
    assert os.path.isfile(filepath), filepath
AssertionError: model_ckpt/coco/split=0/model/shot_5/transformer_resnet50/best.pth

I understood that model_dir in coco.yaml must point to a transformer_resnet50 model, so I set model_dir to the transformer_resnet50 model obtained from my training. It worked, but the results are still below the performance reported in the paper (about 0.3 with 5 shots, s0).

The pre-trained files do not include transformer_resnet50 models. Why? And why do we have two pointers to models, namely model_dir and resume_weights?

Am I missing something?

Thank you for your help!

bach05 avatar Apr 14 '22 12:04 bach05

Hi, have you solved the above issue? Could you please help me?

7M7L avatar Sep 15 '22 04:09 7M7L

One is for the backbone and the other is for the classifier weight transformer.

zhiheLu avatar Mar 30 '24 09:03 zhiheLu
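
Since testing depends on two separate checkpoints (a backbone and a classifier weight transformer), it may help to confirm that both files exist before launching scripts/test.sh, rather than hitting the assertion in src/test.py. The following is a minimal sketch only: the helper name missing_checkpoints and the backbone path are hypothetical and not part of the repository, and which config key (model_dir or resume_weights) holds which checkpoint is inferred from this thread, not confirmed.

```python
import os
from typing import Dict, List


def missing_checkpoints(paths: Dict[str, str]) -> List[str]:
    """Return the labels whose checkpoint file does not exist on disk."""
    return [label for label, path in paths.items() if not os.path.isfile(path)]


if __name__ == "__main__":
    # Hypothetical locations; substitute the values from your own coco.yaml.
    checkpoints = {
        # Assumed: the downloaded pre-trained backbone weights.
        "backbone (pspnet_resnet50)": "path/to/pspnet_resnet50/best.pth",
        # Path taken from the AssertionError reported in this thread.
        "transformer (transformer_resnet50)": (
            "model_ckpt/coco/split=0/model/shot_5/transformer_resnet50/best.pth"
        ),
    }
    for label in missing_checkpoints(checkpoints):
        print(f"missing checkpoint: {label} -> {checkpoints[label]}")
```

If either label is printed, the corresponding file is not at the expected location, which matches the "no weight found" message in the log above.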