tensorrt UnavailableError: Can't provision more than one single cluster at a time

My code:

FP32_SAVED_MODEL_DIR = SAVED_MODEL_DIR+"_TFTRT_FP32/1"
!rm -rf $FP32_SAVED_MODEL_DIR
# Now we create the TF-TRT FP32 engine
trt.create_inference_graph(
    input_graph_def=None,
    outputs=None,
    max_batch_size=1,
    input_saved_model_dir=SAVED_MODEL_DIR,
    output_saved_model_dir=FP32_SAVED_MODEL_DIR,
    precision_mode="FP32")

benchmark_saved_model(FP32_SAVED_MODEL_DIR, BATCH_SIZE=1)

I have also set:

import os
os.environ["CUDA_DEVICE_ORDER"] = "PCI_BUS_ID"
os.environ["CUDA_VISIBLE_DEVICES"] = "0"

When I run it, I get an error: InvalidArgumentError: Failed to import metagraph, check error log for more info

Then I added one line of code: tf.keras.backend.set_learning_phase(0). That error is gone, but another one is raised: UnavailableError: Can't provision more than one single cluster at a time

Hmm... I am using just one GPU, an RTX 2080 Ti.

cuda: Cuda compilation tools, release 10.0, V10.0.130

SOMEONE HELP ME, PLEASE!

Sep 29 '19 09:09 leo-XUKANG

@leo-XUKANG, for the message "InvalidArgumentError: Failed to import metagraph, check error log for more info", could you share the error log?

Oct 10 '19 19:10 aaroey

I'm facing the same issue. Sample code is here: https://gist.github.com/zyenge/2595f3369e7e6128dcc79b1a30c3e3cd I tried both a frozen model and a SavedModel; neither works.

Jan 03 '20 23:01 zyenge

@pooyadavoodi have you encountered a similar issue before? Also @bixia1.

Jan 04 '20 05:01 aaroey

Hey guys, is there any fix for this, please?

Jan 25 '20 13:01 SirPhemmiey

@sanjoy @bixia1 could you help investigate this?

Jan 27 '20 15:01 aaroey

I think the issue was the GPU memory fraction I allocated.

Jan 27 '20 18:01 SirPhemmiey

Any update on this?

Jun 17 '20 15:06 BernardinD

My issue was fixed by fixing the output node names. I mistakenly used the output tensor names of another graph. I'd double check and see if you still have issues when setting outputs to something besides None.

Jul 07 '20 16:07 BernardinD
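A quick way to catch the wrong-output-name mistake described above is to compare the requested names against the node names of the graph before converting. The following is only a minimal sketch: the helper names and the sample node list are hypothetical, and it assumes you have already collected the node names yourself, for example with [n.name for n in graph_def.node].

```python
def normalize_tensor_name(name):
    """Turn a tensor name such as 'dense/Softmax:0' into its node name."""
    # A leading '^' marks a control dependency; ':<index>' selects an output.
    return name.lstrip("^").split(":")[0]


def missing_outputs(requested, graph_node_names):
    """Return the requested output names that are not nodes of the graph."""
    known = set(graph_node_names)
    return [r for r in requested if normalize_tensor_name(r) not in known]


# Hypothetical node names, standing in for the names of a real graph.
node_names = ["input_1", "conv1/Relu", "dense/Softmax"]

# 'logits' does not exist in this graph, so it is reported as missing.
print(missing_outputs(["dense/Softmax:0", "logits"], node_names))
```

If the returned list is not empty, the names passed as outputs probably belong to a different graph.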

For: Can't provision more than one single cluster at a time

I believe this happens because the graph was already loaded and the conversion did not succeed. When you then rerun the cell in Jupyter, the GPU memory has not been released. Check the graph again to verify that the outputs are correct, and restart the Jupyter kernel every time a conversion fails.

Jun 14 '21 07:06 dtlam26
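An alternative to restarting the kernel after every failed attempt, which nobody in this thread has tried, would be to run each conversion in a separate Python process, since a process gives back its GPU memory when it exits. Below is a minimal sketch using only the standard library. The helper name run_in_fresh_process is hypothetical, and the one-line script is a placeholder for real conversion code.

```python
import subprocess
import sys


def run_in_fresh_process(code):
    """Run a Python snippet in a new interpreter and return the result.

    Any resources the child process acquires are released when it exits,
    whether the snippet succeeds or fails.
    """
    return subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,  # collect stdout and stderr for inspection
        text=True,            # decode the output as str instead of bytes
    )


# Placeholder script; the real conversion code would go here.
result = run_in_fresh_process("print('conversion would run here')")
print(result.returncode, result.stdout.strip())
```

A non-zero returncode means the attempt failed, and result.stderr holds the traceback, which is also where the detailed TensorFlow error log would be expected to end up.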

For: Failed to import metagraph, check error log for more info

If you use a Jupyter notebook, please check the output printed in the terminal console. There will be a hint about which node name you typed incorrectly. I suggest checking the whole graph in TensorBoard to get the correct names for the outputs.

Jun 14 '21 07:06 dtlam26
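Besides looking at the graph in TensorBoard, you can narrow down the likely output names by listing the nodes that no other node consumes. This is a rough sketch under assumptions: the function name and the sample graph are hypothetical, and the (name, inputs) pairs are assumed to have been built beforehand, for example with [(n.name, list(n.input)) for n in graph_def.node].

```python
def candidate_outputs(nodes):
    """Return names of nodes that are never used as an input of another node.

    nodes is a list of (name, inputs) pairs, where inputs may carry a
    ':<index>' suffix or a leading '^' for control dependencies.
    """
    consumed = set()
    for _, inputs in nodes:
        for inp in inputs:
            consumed.add(inp.lstrip("^").split(":")[0])
    return [name for name, _ in nodes if name not in consumed]


# Hypothetical three-node graph: input -> conv -> softmax.
nodes = [
    ("input_1", []),
    ("conv1/Relu", ["input_1"]),
    ("dense/Softmax", ["conv1/Relu:0"]),
]

print(candidate_outputs(nodes))
```

Real graphs often contain extra unconsumed nodes (savers, summaries, initializers), so treat the result as a shortlist to check by hand.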
