Hi @Slyne, thanks a lot for your work. I followed the demo in runtime/server/x86_gpu but got an error while starting the server. Could you offer some advice?

Log

== Triton Inference Server ==

NVIDIA Release 21.10 (build 28453983)

This container image and its contents are governed by the NVIDIA Deep Learning Container License. By pulling and using the container, you accept the terms and conditions of this license: https://developer.nvidia.com/ngc/nvidia-deep-learning-container-license

NOTE: Legacy NVIDIA Driver detected. Compatibility mode ENABLED.

mkdir: cannot create directory '/ws/model_repo/attention_rescoring/1': File exists
I0104 07:43:00.874036 60 metrics.cc:298] Collecting metrics for GPU 0: Tesla V100-SXM2-16GB
I0104 07:43:02.456442 60 libtorch.cc:1092] TRITONBACKEND_Initialize: pytorch
I0104 07:43:02.456502 60 libtorch.cc:1102] Triton TRITONBACKEND API version: 1.6
I0104 07:43:02.456514 60 libtorch.cc:1108] 'pytorch' TRITONBACKEND API version: 1.6
2022-01-04 07:43:03.810494: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library libcudart.so.11.0
I0104 07:43:04.208671 60 tensorflow.cc:2170] TRITONBACKEND_Initialize: tensorflow
I0104 07:43:04.208747 60 tensorflow.cc:2180] Triton TRITONBACKEND API version: 1.6
I0104 07:43:04.208773 60 tensorflow.cc:2186] 'tensorflow' TRITONBACKEND API version: 1.6
I0104 07:43:04.208786 60 tensorflow.cc:2210] backend configuration: {}
I0104 07:43:04.261060 60 onnxruntime.cc:1999] TRITONBACKEND_Initialize: onnxruntime
I0104 07:43:04.262061 60 onnxruntime.cc:2009] Triton TRITONBACKEND API version: 1.6
I0104 07:43:04.262548 60 onnxruntime.cc:2015] 'onnxruntime' TRITONBACKEND API version: 1.6
I0104 07:43:04.426943 60 openvino.cc:1193] TRITONBACKEND_Initialize: openvino
I0104 07:43:04.427015 60 openvino.cc:1203] Triton TRITONBACKEND API version: 1.6
I0104 07:43:04.427036 60 openvino.cc:1209] 'openvino' TRITONBACKEND API version: 1.6
I0104 07:43:04.912607 60 pinned_memory_manager.cc:240] Pinned memory pool is created at '0x7efbbc000000' with size 268435456
I0104 07:43:04.922233 60 cuda_memory_manager.cc:105] CUDA memory pool is created on device 0 with size 67108864
I0104 07:43:04.945550 60 model_repository_manager.cc:1022] loading: decoder:1
I0104 07:43:05.046060 60 model_repository_manager.cc:1022] loading: feature_extractor:1
I0104 07:43:05.047111 60 onnxruntime.cc:2058] TRITONBACKEND_ModelInitialize: decoder (version 1)
I0104 07:43:05.060228 60 onnxruntime.cc:2101] TRITONBACKEND_ModelInstanceInitialize: decoder_0_0 (GPU device 0)
I0104 07:43:05.147370 60 model_repository_manager.cc:1022] loading: encoder:1
I0104 07:43:05.248009 60 model_repository_manager.cc:1022] loading: scoring:1
I0104 07:43:27.458409 60 onnxruntime.cc:2101] TRITONBACKEND_ModelInstanceInitialize: decoder_0_1 (GPU device 0)
I0104 07:43:28.442748 60 onnxruntime.cc:2058] TRITONBACKEND_ModelInitialize: encoder (version 1)
I0104 07:43:28.444939 60 model_repository_manager.cc:1183] successfully loaded 'decoder' version 1
I0104 07:43:28.445175 60 python.cc:1875] TRITONBACKEND_ModelInstanceInitialize: scoring_0_0 (CPU device 0)
Initialized Rescoring!
I0104 07:43:29.382147 60 python.cc:1875] TRITONBACKEND_ModelInstanceInitialize: feature_extractor_0_0 (GPU device 0)
I0104 07:43:52.319210 60 onnxruntime.cc:2101] TRITONBACKEND_ModelInstanceInitialize: encoder_0_0 (GPU device 0)
I0104 07:43:54.352376 60 onnxruntime.cc:2135] TRITONBACKEND_ModelInstanceFinalize: delete instance state
I0104 07:43:54.352484 60 python.cc:1875] TRITONBACKEND_ModelInstanceInitialize: scoring_0_1 (CPU device 0)
I0104 07:43:54.352766 60 onnxruntime.cc:2081] TRITONBACKEND_ModelFinalize: delete model state
E0104 07:43:54.352834 60 model_repository_manager.cc:1186] failed to load 'encoder' version 1: Invalid argument: model 'encoder', tensor 'encoder_out': the model expects 3 dimensions (shape [-1,-1,512]) but the model configuration specifies 3 dimensions (an initial batch dimension because max_batch_size > 0 followed by the explicit tensor shape, making complete shape [-1,-1,-1])
Initialized Rescoring!
...

Jan 04 '22 08:01 yzmyyff

Resolved.

The wrong output shape was written to the model configuration. It may be caused by convert.py.

Jan 04 '22 10:01 yzmyyff

@yzmyyff Thank you. Could you provide a link to your train.yaml/config.yaml? I found that in some models the last dim of encoder_out is -1, which is strange. It should be the true output size, which in your case is 512. I left a comment here: https://github.com/wenet-e2e/wenet/blob/main/runtime/server/x86_gpu/scripts/convert.py#L82

Jan 04 '22 11:01 Slyne

Sure, here are the two files: https://gist.github.com/yzmyyff/36fff2a3be9b57870034b0787e70f5ee

How about raising a warning if no shape is provided?

Jan 05 '22 02:01 yzmyyff

It would be better to extract the output shape from the ONNX model instead of the config file. I'll submit a PR to fix it.
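A minimal sketch of that idea, assuming the onnx Python package is installed. The helper names onnx_output_dims and to_triton_dims are hypothetical and are not part of convert.py. The second helper follows the rule stated in the error message: when max_batch_size > 0, Triton adds the batch dimension itself, so the config should list only the remaining dims.

```python
from typing import List, Union

Dim = Union[int, str, None]


def onnx_output_dims(model_path: str, output_name: str) -> List[Dim]:
    """Return the declared dims of one graph output of an ONNX file.

    Fixed sizes come back as ints, symbolic sizes as their names,
    and dims with no information as None.
    """
    import onnx  # imported lazily so to_triton_dims works without onnx

    model = onnx.load(model_path)
    for out in model.graph.output:
        if out.name != output_name:
            continue
        dims: List[Dim] = []
        for d in out.type.tensor_type.shape.dim:
            if d.HasField("dim_value"):
                dims.append(d.dim_value)
            elif d.HasField("dim_param"):
                dims.append(d.dim_param)
            else:
                dims.append(None)
        return dims
    raise KeyError(f"output {output_name!r} not found in {model_path}")


def to_triton_dims(dims: List[Dim], max_batch_size: int) -> List[int]:
    """Convert ONNX dims to the dims list for a Triton config.pbtxt.

    Symbolic or unknown sizes become -1. If max_batch_size > 0 the
    leading batch dim is dropped, because Triton prepends it.
    """
    triton_dims = [d if isinstance(d, int) and d > 0 else -1 for d in dims]
    if max_batch_size > 0:
        triton_dims = triton_dims[1:]
    return triton_dims
```

For an encoder_out declared as [batch, time, 512] and a config with max_batch_size > 0, to_triton_dims would return [-1, 512], which matches the shape [-1,-1,512] that the model expects once Triton adds the batch dimension.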

Thank you again!

2022.01.19 update: @yzmyyff What's your PyTorch version? I filed an issue here.

Jan 05 '22 03:01 Slyne

This bug still exists in the current main branch. How can I fix it?

Apr 22 '22 16:04 raycool

Please try PyTorch 1.9 or the latest PyTorch, as mentioned in https://github.com/pytorch/pytorch/issues/71408.

Another quick fix is to edit the generated model_repo/encoder/config.pbtxt and set the output shape manually.

@robin1001 Does the wenet source code depend on a particular PyTorch version?
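The manual edit can also be scripted. The sketch below is a hypothetical helper (set_output_dims is not part of wenet or Triton) that rewrites the dims entry of one named tensor in the text of a config.pbtxt. It assumes the dims field follows the name field inside the same block, which may not hold for every generated config.

```python
import re
from typing import List


def set_output_dims(config_text: str, tensor_name: str, dims: List[int]) -> str:
    """Replace the dims list of the tensor called tensor_name.

    Assumption: within the tensor's { ... } block, the dims field
    appears after the name field. Raises ValueError otherwise.
    """
    pattern = re.compile(
        r'(name:\s*"' + re.escape(tensor_name) + r'"[^}]*?dims:\s*)\[[^\]]*\]'
    )
    new_dims = "[" + ", ".join(str(d) for d in dims) + "]"
    # A function replacement avoids backslash handling in the new text.
    new_text, count = pattern.subn(
        lambda m: m.group(1) + new_dims, config_text, count=1
    )
    if count == 0:
        raise ValueError(f"no dims entry found for tensor {tensor_name!r}")
    return new_text


if __name__ == "__main__":
    path = "model_repo/encoder/config.pbtxt"  # path from the thread
    with open(path) as f:
        text = f.read()
    # 512 is the output size reported in the error log above; since
    # max_batch_size > 0, the batch dimension is left out of dims.
    with open(path, "w") as f:
        f.write(set_output_dims(text, "encoder_out", [-1, 512]))
```

It is worth backing up the file first, since the script overwrites it in place.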

Apr 26 '22 15:04 Slyne

This issue has been automatically closed due to inactivity.

Feb 02 '24 01:02 github-actions[bot]

Can't load model in offline GPU demo
