aanet CUDA11

Hi, is there any way to make this project work with CUDA11?

thx

Mar 10 '21 12:03 trigal

Hi, I haven't tested with CUDA11. I'd recommend giving it a try to see what happens.

Mar 18 '21 07:03 haofeixu

I tried it on a DGX A100 machine with A100-SXM4-40GB GPUs, using the NVIDIA Docker image nvcr.io/nvidia/pytorch:19.10-py3, which should meet the requirements listed in the description. The problem is that, as far as I understand, these GPUs are not compatible with CUDA10 drivers.

When I try to run the network on an updated configuration with CUDA11, the system hangs at https://github.com/haofeixu/aanet/blob/master/predict.py#L87 on the 'to(device)' call, so I suspect something is wrong with the model or, more likely, with the deform_conv package.
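For anyone hitting the same hang, it may help to first confirm that the PyTorch build and the GPU actually match. The snippet below is only a rough sketch: the helper names `passes_ampere_check` and `report` are made up for illustration, and the rule it encodes (GPUs with compute capability 8.x or newer, such as the A100, need a CUDA 11 or newer build) is a necessary condition, not a guarantee that everything will work.

```python
def passes_ampere_check(capability, cuda_version):
    """Necessary (not sufficient) check: sm_8x and newer GPUs need CUDA >= 11.

    capability   -- (major, minor) tuple, e.g. (8, 0) for an A100
    cuda_version -- CUDA version string of the PyTorch build, e.g. "11.1"
    """
    major_cc = capability[0]
    cuda_major = int(cuda_version.split(".")[0])
    if major_cc >= 8:
        return cuda_major >= 11
    # Older GPUs are not covered by this rule, so nothing is flagged for them.
    return True


def report():
    """Print the PyTorch/CUDA build and the result of the check for each GPU."""
    import torch  # imported lazily so the helper above works without PyTorch

    print("PyTorch:", torch.__version__, "built with CUDA", torch.version.cuda)
    if not torch.cuda.is_available():
        print("No usable CUDA device")
        return
    for i in range(torch.cuda.device_count()):
        cap = torch.cuda.get_device_capability(i)
        ok = passes_ampere_check(cap, torch.version.cuda)
        status = "ok" if ok else "CUDA build too old for this GPU"
        print(i, torch.cuda.get_device_name(i), "sm_%d%d" % cap, status)
```

Calling `report()` inside the container prints one line per GPU. If an A100 shows up next to a CUDA 10 build, the mismatch is the first thing to fix before looking at deform_conv.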

Mar 18 '21 09:03 trigal

Have you successfully compiled the deform_conv package?

Mar 18 '21 09:03 haofeixu

I'm pretty certain it compiled without errors, but I'll try again in the next few days and report the compiler output here.

Mar 23 '21 09:03 trigal

My GPU's driver is not compatible with CUDA10, only with CUDA11.0. Were you able to build deformable_conv with CUDA11.0?

Mar 24 '21 13:03 zyl1336110861

I just compiled the deformable_conv module with CUDA11.1, pytorch 1.7.0, python3.7.4, gcc5.5. The first error I hit was "AT_CHECK is not declared in this scope", so I changed every "AT_CHECK" to "TORCH_CHECK" in the cpp src files, following #11. The error message appears in the middle of the compiler output, so be careful not to miss it.
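The rename described above can be scripted instead of done by hand. This is a minimal sketch, not something taken from #11: the function name `replace_at_check` and the list of file extensions are assumptions (the comment only mentions "cpp src files"), and the directory you pass in is whatever folder holds the deform_conv sources in your checkout.

```python
import re
from pathlib import Path

# Assumed set of extensions; adjust to match the files in your source folder.
SOURCE_SUFFIXES = {".cpp", ".h", ".cu", ".cuh"}

# Whole-word match, so identifiers that merely contain AT_CHECK are left alone.
AT_CHECK = re.compile(r"\bAT_CHECK\b")


def replace_at_check(root):
    """Rewrite AT_CHECK to TORCH_CHECK in source files under root.

    Returns the list of files that were modified.
    """
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text()
        new_text = AT_CHECK.sub("TORCH_CHECK", text)
        if new_text != text:
            path.write_text(new_text)
            changed.append(path)
    return changed
```

For example, `replace_at_check("path/to/deform_conv/src")` (a placeholder path) patches the files in place and returns the ones it touched. Running it a second time returns an empty list, so it is safe to repeat before rebuilding the extension.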

Mar 27 '21 09:03 zyl1336110861

@haofeixu

Mar 27 '21 09:03 zyl1336110861

Thanks @zyl1336110861 for sharing your solution! Hope it can be helpful for others!

Apr 03 '21 17:04 haofeixu

I can run it successfully on a single GPU, but when I use multiple GPUs the process hangs. My CUDA version is 11.3, with pytorch 1.9.0 and python3.8. Is there any way to fix that? @haofeixu @ all

Feb 08 '22 11:02 q5390498

How did you solve it? I couldn't find a description in #11.

May 13 '22 06:05 llllooorange

Hi all, sorry for the late response.

If this issue is still relevant to you, I would suggest trying our new GMStereo model: https://haofeixu.github.io/unimatch/ & https://github.com/autonomousvision/unimatch. No CUDA op is required. A Colab demo is also provided so you can try the model in your browser. Hope it helps, thanks.

Nov 13 '22 04:11 haofeixu
