Scene-Graph-Benchmark.pytorch: Error when installing maskrcnn_benchmark

Some common problems & solutions when installing maskrcnn_benchmark.

1. THC.h: No such file or directory / THCeilDiv undefined: see this

2. identifier "THCudaCheck" is undefined see this

3. torch.utils.cpp_extension.load stuck see this

Apr 03 '24 05:04 draym28

Hi,

The code in this repo is very outdated and is not up to date with current CUDA standards. I fixed all of those issues in my implementation; you can probably copy the csrc folder into your local path and compile without any issues (I tested it with CUDA 11+): https://github.com/Maelic/SGG-Benchmark/tree/main/sgg_benchmark/csrc

Best

Apr 03 '24 13:04 Maelic

Thanks for your help! But after switching to your csrc, when I run SGDet on custom images following the instructions in README.md, another error comes up:

D:\App\Anaconda3\envs\sgg\lib\site-packages\torch\utils\cpp_extension.py:358: UserWarning: Error checking compiler version for cl: 'cp1' codec can't decode bytes in position 0--1: No mapping for the Unicode character exists in the target code page.
  warnings.warn(f'Error checking compiler version for {compiler}: {error}')
D:\App\Anaconda3\envs\sgg\lib\site-packages\apex\__init__.py:68: DeprecatedFeatureWarning: apex.amp is deprecated and will be removed by the end of February 2023. Use [PyTorch AMP](https://pytorch.org/docs/stable/amp.html)
  warnings.warn(msg, DeprecatedFeatureWarning)
Traceback (most recent call last):
  File "tools/relation_test_net.py", line 11, in <module>
    from maskrcnn_benchmark.data import make_data_loader
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\__init__.py", line 2, in <module>
    from .build import make_data_loader, get_dataset_statistics
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\build.py", line 14, in <module>
    from . import datasets as D
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\datasets\__init__.py", line 2, in <module>
    from .coco import COCODataset
  File "d:\code\new_proj\v2t\sgg\scenegraphbenchmark\maskrcnn_benchmark\data\datasets\coco.py", line 39, in <module>
    class COCODataset(torchvision.datasets.coco.CocoDetection):
AttributeError: module 'torchvision' has no attribute 'datasets'

I am still stuck on this step. It is driving me crazy.

Apr 04 '24 11:04 draym28

Which version of torchvision are you using?

Apr 04 '24 12:04 Maelic

It works for me with torchvision 0.17 for cuda 12.1

Apr 04 '24 13:04 Maelic

I am using pytorch=1.13 and torchvision=0.14. I can import torchvision.datasets as you did, but when I run the script for SGDet on custom images, the error comes up. It is confusing.

Apr 04 '24 13:04 draym28

Then you may be running your code in another conda env or something like that. You can also try to clean and re-build the package with something like rm -rf ./build/ && python setup.py build develop

Apr 04 '24 13:04 Maelic

I have cleaned up and created a new env many times, but the error still comes up. I also ran python setup.py build develop every time. Many people have this problem too, see this.

Apr 04 '24 13:04 draym28

Can you post the outputs of pip freeze | grep torchvision and conda list | grep torchvision ? You may have different versions of torchvision installed at the same time.
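The same check can also be done from inside Python, which has the advantage of using exactly the interpreter that runs the failing script. Below is a minimal sketch using only the standard library; the helper names find_distributions and resolve_module are made up for this example. It lists every installed distribution named torchvision and shows which file `import torchvision` would pick up, without actually importing it. A stray folder or file called torchvision in the working directory would show up here too.

```python
import sys
from importlib import metadata, util


def find_distributions(name):
    """Return (version, location) for every installed distribution called `name`."""
    found = []
    for dist in metadata.distributions():
        try:
            dist_name = dist.metadata["Name"]
        except Exception:
            # Skip distributions whose metadata cannot be read.
            continue
        if dist_name and dist_name.lower() == name.lower():
            found.append((dist.version, str(dist.locate_file(""))))
    return found


def resolve_module(name):
    """Return the file a top-level `import name` would load, or None if not found."""
    spec = util.find_spec(name)
    return None if spec is None else spec.origin


print("interpreter:", sys.executable)
print("torchvision distributions:", find_distributions("torchvision"))
print("import torchvision resolves to:", resolve_module("torchvision"))
```

If more than one distribution is listed, or the resolved path is outside the env's site-packages, that would point to the kind of mix-up described above. Run it from the same directory and env as tools/relation_test_net.py.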

Apr 04 '24 13:04 Maelic

Output of pip freeze | grep torchvision: torchvision==0.14.1

Output of conda list | grep torchvision: torchvision 0.14.1 py38_cu117 pytorch

Apr 04 '24 13:04 draym28

Hmm, I don't know. From your outputs I assume that you installed torchvision with conda; maybe try removing it and installing it with pip instead. On my machine, I installed it with the following command (for CUDA 12.1): pip install torch==2.2.1 torchvision==0.17.1 torchaudio==2.2.1 --index-url https://download.pytorch.org/whl/cu121

Apr 04 '24 13:04 Maelic

It still doesn't work. This time I created a new env and used pip install torch==1.13.1 torchvision==0.14.1 torchaudio==0.13.1 --index-url https://download.pytorch.org/whl/cu117, but the error still comes up.

Apr 04 '24 15:04 draym28

I'm afraid I can't help you more here, sorry. I don't recall having this error ever, even when I was working with previous versions of pytorch for this codebase.

Apr 04 '24 15:04 Maelic

It is OK, thanks for your help. I will keep looking for a solution.

Apr 05 '24 00:04 draym28

Hi @Maelic, thank you for sharing your implementation. I'm encountering an issue with installing Apex due to CUDA compatibility. I was wondering if you could provide guidance on how to resolve this. Thanks!

Apr 15 '24 17:04 Ali-Hatami

You don't need to use APEX anymore, as it is deprecated and mixed precision is built into recent versions of torch. Please consider removing all references to apex, as well as this line: https://github.com/KaihuaTang/Scene-Graph-Benchmark.pytorch/blob/4b6b71a90d4198d9dae574d42b062a5e534da291/tools/relation_train_net.py#L159

Then add this a little above it:

with torch.autocast(device_type='cuda', dtype=torch.float16, enabled=use_amp):
    loss_dict = model(images, targets)

    losses = sum(loss for loss in loss_dict.values())

And it should work, see:

https://github.com/Maelic/SGG-Benchmark/blob/cecf1bbe46f3d862704d9cf0ffccf2282fb00cfe/tools/relation_train_net.py#L51

Apr 15 '24 17:04 Maelic

Thank you for the prompt response. Following the step-by-step installation (https://github.com/Maelic/SGG-Benchmark/blob/main/INSTALL.md), I get the error below. My CUDA version is 11.5, but 11.5 is not available in the nvidia channels. How can I solve this issue?

RuntimeError: The detected CUDA version (11.5) mismatches the version that was used to compile PyTorch (12.1). Please make sure to use the same CUDA versions.

Apr 15 '24 18:04 Ali-Hatami

Try upgrading your CUDA version or building torch from source. By the way, this issue is not directly related to this work; you will probably have more success if you ask on the dedicated PyTorch forum.
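To see both sides of the mismatch before rebuilding anything, something like the sketch below might help. It compares the release reported by the nvcc found on PATH with the CUDA version the installed torch wheel was built against (torch.version.cuda). The helper names are invented for this example, and the nvcc on PATH is only an approximation: the extension build may use a different toolkit if CUDA_HOME points elsewhere.

```python
import re
import shutil
import subprocess


def parse_nvcc_release(text):
    """Extract (major, minor) from `nvcc --version` output, or None if absent."""
    match = re.search(r"release (\d+)\.(\d+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None


def local_toolkit_version():
    """Version of the nvcc found on PATH, or None if nvcc is not installed."""
    nvcc = shutil.which("nvcc")
    if nvcc is None:
        return None
    result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    return parse_nvcc_release(result.stdout)


def torch_build_version():
    """CUDA version torch was compiled with, or None for CPU builds / no torch."""
    try:
        import torch
    except ImportError:
        return None
    if torch.version.cuda is None:
        return None
    major, minor = torch.version.cuda.split(".")[:2]
    return (int(major), int(minor))


print("nvcc toolkit:", local_toolkit_version())
print("torch built with:", torch_build_version())
```

As far as I can tell, the RuntimeError above is raised when the two major versions differ (11 vs 12 here), so installing a torch wheel built for a CUDA 11.x release might be an alternative to upgrading the toolkit. Treat that as an assumption to verify, not a confirmed fix.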

Apr 16 '24 08:04 Maelic
