ObstructionRemoval
How much VRAM is needed?
I created a Colab notebook to run this project on Google's GPU (I think it comes with a Tesla T4, ~16GB VRAM) and I still get OOM errors. Any ideas, or is it just not possible to run this on a GPU?
https://colab.research.google.com/drive/14zi-3LVOhp6_DQZPl0tW-7H4U3RDUNIK#scrollTo=Sr6EibfN97Xo
Same here. I tested test_reflection.py with an NVIDIA V100 16GB GPU. I changed nn_opts['gpu_devices'] = ['/device:CPU:0'] and nn_opts['controller'] = '/device:CPU:0' to nn_opts['gpu_devices'] = ['/device:GPU:0'] and nn_opts['controller'] = '/device:GPU:0', and I also set os.environ["CUDA_VISIBLE_DEVICES"] to '0'. This resulted in an OOM error.
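For reference, the change described here amounts to roughly the following. This is only a sketch: nn_opts is a plain dict standing in for the options that test_reflection.py builds (an assumption, since the real script sets more keys than these), and only the two keys and the environment variable quoted in this post are shown.

```python
import os

# Expose only the first GPU. This should be set before TensorFlow
# initializes CUDA, so in practice before the framework is imported.
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

# Stand-in for the options dict used by test_reflection.py (assumption:
# the real script defines many more entries than the two shown here).
nn_opts = {}

# Original CPU settings:
#   nn_opts['gpu_devices'] = ['/device:CPU:0']
#   nn_opts['controller'] = '/device:CPU:0'

# GPU settings that triggered the OOM:
nn_opts['gpu_devices'] = ['/device:GPU:0']
nn_opts['controller'] = '/device:GPU:0'

print(nn_opts)
```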
Below is my error message.
Limit: 68719476736
InUse: 9660434944
MaxInUse: 9660434944
NumAllocs: 2676
MaxAllocSize: 668467200
2021-04-05 16:05:34.480915: W tensorflow/core/common_runtime/bfc_allocator.cc:279] ******************_************************************__**************************__*******___*_***
2021-04-05 16:05:34.480937: W tensorflow/core/framework/op_kernel.cc:1275] OP_REQUIRES failed at gpu_swapping_kernels.cc:43 : Resource exhausted: OOM when allocating tensor with shape[40,32,272,480] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator cuda_host_bfc
2021-04-05 16:05:34.558797: E tensorflow/stream_executor/cuda/cuda_driver.cc:965] failed to alloc 17179869184 bytes on host: CUDA_ERROR_INVALID_VALUE
2021-04-05 16:05:34.558835: W ./tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:40] could not allocate pinned host memory of size: 17179869184
2021-04-05 16:05:34.636627: E tensorflow/stream_executor/cuda/cuda_driver.cc:965] failed to alloc 17179869184 bytes on host: CUDA_ERROR_INVALID_VALUE
2021-04-05 16:05:34.636665: W ./tensorflow/core/common_runtime/gpu/cuda_host_allocator.h:40] could not allocate pinned host memory of size: 17179869184
I guess it's because the test image is a little large (1920 x 1080). I know we can still get results by running the code on CPUs, but it's really slow. Is there a way to speed up testing? Any help would be appreciated!
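To put the numbers from the log in perspective, here is a small back-of-the-envelope calculation. It only uses values quoted in the error message; the helper name tensor_bytes and the half-resolution shape are my own illustration, not something from the project. One observation, offered as a reading of the log rather than a diagnosis: the failing allocator is cuda_host_bfc and the failed requests are "on host", so the allocation that fails appears to be 16 GiB of pinned host memory rather than GPU VRAM itself.

```python
import math

GIB = 1024 ** 3
FLOAT32_BYTES = 4  # the log says "type float", i.e. 32-bit floats

def tensor_bytes(shape):
    """Size in bytes of a dense float32 tensor with the given shape."""
    return math.prod(shape) * FLOAT32_BYTES

# Shape reported in the OOM message.
failing_shape = (40, 32, 272, 480)
print(tensor_bytes(failing_shape))  # 668467200, same as MaxAllocSize in the log

# Other figures from the log, converted to GiB.
limit = 68719476736          # "Limit"
failed_alloc = 17179869184   # "failed to alloc ... bytes on host"
print(limit / GIB)           # 64.0
print(failed_alloc / GIB)    # 16.0

# Hypothetical: halving the height and width of this tensor
# (e.g. by feeding a smaller test image) cuts it to a quarter of the size.
half_shape = (40, 32, 136, 240)
print(tensor_bytes(half_shape) / tensor_bytes(failing_shape))  # 0.25
```

Since this tensor grows linearly with the number of pixels, downscaling the 1920 x 1080 input before testing is one thing that might be worth trying, though whether that is enough to avoid the OOM is not something the log alone can confirm.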