RuntimeError: cudaErrorUnknown(999): unknown error

WANGRUI-ZB opened this issue 1 year ago • 9 comments

I followed the "Running 3DGS+T" step in the README and ran evc -t gui -c configs/exps/gaussiant/gaussiant_${expname}.yaml,configs/specs/superm.yaml, but it failed with RuntimeError: cudaErrorUnknown(999): unknown error. Please help me.

WANGRUI-ZB avatar Dec 14 '23 09:12 WANGRUI-ZB

Hi @WANGRUI-ZB, sorry for the late reply. Which PyTorch and CUDA versions are you using? I'll try to reproduce the error. It would be best if you could paste the output of pip list here so that I can recreate the environment.

Also, could you please check whether the DISPLAY variable is set in your terminal by running echo $DISPLAY?
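For anyone collecting the same details, here is a minimal Python sketch that gathers them in one place. It assumes PyTorch may or may not be installed in the active environment; collect_env_report is a hypothetical helper name used only for illustration and is not part of EasyVolcap.

```python
import importlib.util
import os
import platform


def collect_env_report():
    """Gather the details requested above: PyTorch/CUDA versions and DISPLAY.

    Hypothetical helper for illustration only, not an EasyVolcap API.
    """
    report = {
        "python": platform.python_version(),
        # DISPLAY is None when no X display is configured for this shell.
        "DISPLAY": os.environ.get("DISPLAY"),
        "torch": None,
        "torch_cuda": None,
        "cuda_available": None,
    }
    # Only import torch if it is installed, so the script still runs without it.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch"] = torch.__version__
        # CUDA version PyTorch was built against (None for CPU-only builds).
        report["torch_cuda"] = torch.version.cuda
        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for key, value in collect_env_report().items():
        print(f"{key}: {value}")
```

Pasting this output together with pip list should be enough to recreate the environment. Note that torch.version.cuda reports the CUDA version PyTorch was built with, which can differ from the system-wide CUDA toolkit.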

dendenxu avatar Dec 17 '23 11:12 dendenxu

Hi @dendenxu, thank you for your reply. My PyTorch version is 2.1.1 and my CUDA version is 12.3. I checked the display variable, and echo $DISPLAY prints :0. The following is the result of pip list:

WANGRUI-ZB avatar Dec 18 '23 08:12 WANGRUI-ZB

Hi, it looks like you're using the latest CUDA and NVIDIA driver, which we haven't tested. I'll try to set up a test environment and get back to you later.

In the meantime, you could try creating a new environment with a slightly lower CUDA version (we've tested 12.1 and 11.8) and a matching PyTorch installation.

dendenxu avatar Dec 18 '23 08:12 dendenxu

Hi, can you tell me the recommended CUDA and PyTorch versions? I tried CUDA 11.8 before, but it reported the same error!

WANGRUI-ZB avatar Dec 18 '23 08:12 WANGRUI-ZB

PyTorch 1.12.1 -> 2.0.1 and CUDA 11.8 -> 12.1 have been tested on Ubuntu and Windows. Also, is your testing environment a dual-GPU laptop? I suspect the error could be due to mismatches between the OpenGL's context GPU and CUDA's GPU.

dendenxu avatar Dec 18 '23 09:12 dendenxu

I suspect the error could be due to mismatches between the OpenGL's context GPU and CUDA's GPU.

If that's the case, you could try disabling the integrated graphics card in the BIOS. For now, EasyVolcap only supports Nvidia GPUs, but I think it shouldn't be too hard to skip creating the OpenGL context on integrated Intel graphics cards in the future.
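Before changing BIOS settings, it may help to check which GPU actually owns the OpenGL context. The sketch below is one rough way to do that, assuming a Linux machine with an X display and the glxinfo tool installed (usually shipped in the mesa-utils package). The function names are hypothetical and not part of EasyVolcap, and the name match on "nvidia" is only a heuristic.

```python
import shutil
import subprocess


def parse_opengl_renderer(glxinfo_output):
    """Return the value of the 'OpenGL renderer string' line, or None."""
    for line in glxinfo_output.splitlines():
        if line.strip().startswith("OpenGL renderer string:"):
            return line.split(":", 1)[1].strip()
    return None


def renderer_is_nvidia(renderer):
    """Heuristic: treat the renderer as an NVIDIA GPU if its name says so."""
    return renderer is not None and "nvidia" in renderer.lower()


def query_opengl_renderer():
    """Run `glxinfo -B` if it is available and return the renderer string."""
    if shutil.which("glxinfo") is None:
        return None  # glxinfo is not installed; nothing to report
    result = subprocess.run(
        ["glxinfo", "-B"], capture_output=True, text=True, check=False
    )
    return parse_opengl_renderer(result.stdout)


if __name__ == "__main__":
    renderer = query_opengl_renderer()
    print("OpenGL renderer:", renderer)
    if renderer is not None and not renderer_is_nvidia(renderer):
        # Matches the suspicion above: OpenGL on one GPU, CUDA on another.
        print("The OpenGL context does not appear to be on an NVIDIA GPU.")
```

If the renderer string names an Intel or other integrated GPU while CUDA runs on the Nvidia card, that is consistent with the mismatch suspected above.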

dendenxu avatar Dec 18 '23 09:12 dendenxu

Okay, I'll try disabling it and running again.

WANGRUI-ZB avatar Dec 18 '23 09:12 WANGRUI-ZB

Hi @WANGRUI-ZB. Did the provided solution solve your issue? Do you still face the same exception after disabling the integrated graphics card? I'll be happy to provide more assistance if the issue persists : ]

dendenxu avatar Dec 24 '23 05:12 dendenxu

I have been busy with other things lately, so I haven't tried it yet. I'll report back once I have a result.

WANGRUI-ZB avatar Dec 28 '23 08:12 WANGRUI-ZB

I've verified that it works with the integrated graphics card disabled.

WANGRUI-ZB avatar Jan 10 '24 06:01 WANGRUI-ZB

@WANGRUI-ZB Glad to know! Closing the issue as completed for now. We'll reopen it once we've added integrated graphics support.

dendenxu avatar Jan 10 '24 07:01 dendenxu

Hi @dendenxu, I have also run into a similar problem. My environment is shown in the attached image. What should I do to solve this? [image]

Linkersem avatar May 20 '24 10:05 Linkersem

Hi, sorry for the late reply. This seems to be an issue with the downstream code repo LoG, so maybe we should continue the discussion there. It looks like the issue originates from an incorrectly installed LoG environment for Gaussian rasterization.

dendenxu avatar May 22 '24 11:05 dendenxu