
About GPU memory consumption.

Open AlbertHuyb opened this issue 2 years ago • 4 comments

When training with python train_chairs.py configs/train_chairs.yml, I noticed that batch_size=4 exceeds the memory of a single 2080 Ti GPU. I can only set batch_size to 1 on a single 2080 Ti, and even that consumes more than 10 GB of GPU memory.

I use tensorflow=2.3.0 because I noticed that 2.8.0 is not supported by TensorFlow Addons:

Tensorflow Addons supports using Python ops for all Tensorflow versions above or equal to 2.2.0 and strictly below 2.4.0 (nightly versions are not supported). The versions of TensorFlow you are currently using is 2.8.0 and is not supported.
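A minimal way to check the mismatch, assuming only the version range quoted in the warning above (importing tensorflow_addons re-emits that warning when the installed TensorFlow falls outside the range it expects):

```python
import tensorflow as tf

# The warning above says the installed tensorflow-addons build expects
# TensorFlow >= 2.2.0 and < 2.4.0 (nightly builds excluded).
print("TensorFlow:", tf.__version__)

import tensorflow_addons as tfa  # prints the compatibility warning again if the range is violated
print("TensorFlow Addons:", tfa.__version__)
```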

AlbertHuyb avatar May 10 '22 03:05 AlbertHuyb

There's probably a memory leak somewhere...
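One way to check that suspicion is to log GPU memory per step; a sketch with a hypothetical helper (log_gpu_memory is not part of tf-raft, and tf.config.experimental.get_memory_info needs TF 2.5 or newer, while TF 2.4 only offers tf.config.experimental.get_memory_usage for the current value):

```python
import tensorflow as tf

def log_gpu_memory(step):
    # A steadily growing 'current' value across steps would point at a leak;
    # a flat value means the model/batch is simply large.
    info = tf.config.experimental.get_memory_info('GPU:0')
    print(f"step {step}: current={info['current'] / 2**20:.0f} MiB, "
          f"peak={info['peak'] / 2**20:.0f} MiB")

# Hypothetical placement inside whatever loop train_chairs.py runs:
# for step, batch in enumerate(dataset):
#     train_step(batch)
#     if step % 100 == 0:
#         log_gpu_memory(step)
```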

adeeb10abbas avatar May 10 '22 14:05 adeeb10abbas

There's probably a memory leak somewhere...

Could you please share your environment and GPU memory consumption? I'm quite new to TensorFlow 2 and feel rather confused.

Thanks for your help!

AlbertHuyb avatar May 10 '22 14:05 AlbertHuyb

After I set os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true", batch_size=1 occupies 10996 MiB and batch_size=2 also occupies 10996 MiB, while batch_size=3 runs out of memory (OOM).
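For reference, a short sketch of the same setting done two ways (the tf.config call below is the documented TF 2.x way to enable on-demand allocation and should behave like the environment variable; both must take effect before the first GPU op):

```python
import os
import tensorflow as tf

# Option 1: the environment variable used above, set before TensorFlow touches the GPU.
os.environ["TF_FORCE_GPU_ALLOW_GROWTH"] = "true"

# Option 2: the equivalent tf.config call; allocate GPU memory on demand
# instead of reserving nearly all of it up front.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)
```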

I'm using Python 3.7.13, TensorFlow 2.4.0, and cudatoolkit 11.0 on Ubuntu 18.04.

AlbertHuyb avatar May 10 '22 15:05 AlbertHuyb

This fixed it for me: https://github.com/daigo0927/tf-raft/pull/27

giulionf avatar Jun 28 '22 16:06 giulionf