
prepare function ignoring device, always using GPU

nnmm opened this issue · 7 comments

Describe the bug

onnx_tf allocates a large amount of memory on the GPU (on two GPUs, actually) in the prepare function, despite the device='CPU' parameter being passed. This leads to out-of-memory crashes. It is also very slow.

To Reproduce

Download resnet50 from the model zoo and unpack it into /tmp.

import onnx
from onnx_tf.backend import prepare
onnx_model = onnx.load('/tmp/resnet50/model.onnx')
prepare(onnx_model, device='CPU')

The primary GPU now has memory usage of 7705MiB and the secondary of 4573MiB.
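One way to confirm which devices TensorFlow has actually claimed is to enumerate them. This is a diagnostic sketch using the TF 1.x device_lib API (matching the versions listed below; in TF 2.x the supported equivalent is tf.config.list_physical_devices), not part of any fix:

```python
def visible_devices():
    """List the device names TensorFlow reports, or [] if TF is unavailable."""
    try:
        # TF 1.x-era API; still importable in later versions.
        from tensorflow.python.client import device_lib
        return [d.name for d in device_lib.list_local_devices()]
    except ImportError:
        return []

# With the bug described above, GPU devices show up here even though
# prepare() was called with device='CPU'.
print(visible_devices())
```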

Python, ONNX, ONNX-TF, Tensorflow version

This section can be obtained by running get_version.py from the util folder.

  • Python version: 3.5.2
  • ONNX version: 1.3.0
  • ONNX-TF version: 1.2.0
  • Tensorflow version: 1.8.0

nnmm avatar Nov 21 '18 10:11 nnmm

Hi, the easiest fix is to make sure onnx-tf doesn't see your GPUs. For instance, run the script like this:

CUDA_VISIBLE_DEVICES= python script.py

tjingrant avatar Nov 21 '18 17:11 tjingrant

I tried that, but curiously, it didn't change anything.


nnmm avatar Nov 21 '18 21:11 nnmm

@nnmm

I tried CUDA_VISIBLE_DEVICES= python inference_test.py and it works for me.

Another thing to try, from this comment: https://github.com/tensorflow/tensorflow/issues/2175#issuecomment-215977173

Run this before your python command (in the same shell session): export CUDA_VISIBLE_DEVICES=

Are you on Windows? I see this related issue (https://github.com/tensorflow/tensorflow/issues/16284).

tjingrant avatar Nov 22 '18 02:11 tjingrant

I'm sorry, the issue was shell-related. I use fish and ran set -x CUDA_VISIBLE_DEVICES instead of set -x CUDA_VISIBLE_DEVICES ''. When I do it correctly, it does not use the GPU. The original issue of not respecting the device parameter still stands, however.
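For reference, the distinction here is empty versus unset: CUDA only hides the GPUs when CUDA_VISIBLE_DEVICES exists and is empty (or -1); an unset variable leaves them all visible. A minimal POSIX-shell sketch (the fish equivalent of the export line is set -x CUDA_VISIBLE_DEVICES ''):

```shell
# Empty string: CUDA enumerates no devices, so TensorFlow stays on the CPU.
export CUDA_VISIBLE_DEVICES=

# Verify what a child process (e.g. python script.py) would actually inherit.
python3 -c 'import os; print(repr(os.environ.get("CUDA_VISIBLE_DEVICES")))'
# prints ''
```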

nnmm avatar Nov 26 '18 15:11 nnmm

@nnmm I agree this is a nuisance.

We'll keep this issue open as a low-priority enhancement request, since it does not hurt functionality. Sorry for the inconvenience.

tjingrant avatar Nov 29 '18 05:11 tjingrant

Hello, classic random bump on an old issue; I just happened to run into this problem today.

Any news, or pointers for contributing a fix?

IceTDrinker avatar Apr 09 '21 13:04 IceTDrinker

By the way, the workaround from https://github.com/tensorflow/tensorflow/issues/16284#issuecomment-364832486 does work:

import os
os.environ["CUDA_VISIBLE_DEVICES"] = "-1"
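One caveat: this assignment only works if it runs before TensorFlow first initializes CUDA, i.e. before the first tensorflow/onnx_tf import anywhere in the process. A small stdlib-only sketch that models this ordering by handing the variable to a fresh child interpreter (the inline script stands in for a real inference script):

```python
import os
import subprocess
import sys

# Put the variable in the environment of a fresh interpreter, so it is
# guaranteed to exist before that process ever imports tensorflow.
env = dict(os.environ, CUDA_VISIBLE_DEVICES="-1")
result = subprocess.run(
    [sys.executable, "-c",
     "import os; print(os.environ['CUDA_VISIBLE_DEVICES'])"],
    env=env, capture_output=True, text=True, check=True,
)
print(result.stdout.strip())  # prints -1
```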

IceTDrinker avatar Apr 09 '21 13:04 IceTDrinker