transformer-deploy
t5 notebook broken with transformer-deploy 0.5.0
Running the t5.ipynb notebook is broken when using the transformer-deploy 0.5.0 Docker container. Specifically, with
```python
def get_random_input_encoder() -> Dict[str, torch.Tensor]:
    max_seq = 128
    seq_len = random.randint(a=1, b=max_seq)
    batch = max_seq // seq_len
    random_input_ids = torch.randint(
        low=0, high=tokenizer.vocab_size, size=(batch, seq_len), dtype=torch.int32, device="cuda"
    )
    inputs = {"input_ids": random_input_ids}
    return inputs


keep_fp32_encoder = get_keep_fp32_nodes(onnx_model_path=encoder_model_path, get_input=get_random_input_encoder)
assert len(keep_fp32_encoder) > 0

enc_model_onnx = convert_fp16(onnx_model=encoder_model_path, nodes_to_exclude=keep_fp32_encoder)
save_onnx(proto=enc_model_onnx, model_path=encoder_fp16_model_path, clean=False)

del enc_model_onnx
torch.cuda.empty_cache()
gc.collect()
```
I get:

```
InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Deserialize tensor onnx::MatMul_2637 failed.corrupted protobuf data: tensor shape size(4194304) does not match the data size(0) in proto
```
@pommedeterresautee
Anyone have a workaround here?
Running an export of the encoder (T5-3b) without mixed precision also seems to give a similar error:
```
Traceback (most recent call last):
  ...
    encoder_onnx = create_model_for_provider(encoder_onnx_path_to_compare, "CUDAExecutionProvider", log_severity=3)
  File "/workspace/transformer-deploy/src/transformer_deploy/backends/ort_utils.py", line 85, in create_model_for_provider
    return InferenceSession(path, options, providers=provider_to_use)
  File "/home/af/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 347, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/af/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 395, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Deserialize tensor embed_tokens.weight failed.corrupted protobuf data: tensor shape size(32876544) does not match the data size(0) in proto
```
I was using torch==1.12.1+cu116, onnx==1.12.0, and onnxruntime-gpu==1.12.1, running on the same NVIDIA driver (515.65.01) and CUDA 11.7 as the notebook. Could you share the library versions with which the notebook ran successfully?
I temporarily solved the issue by using commit 8bfe4f58a4cbdf84348a37838ba61c980bc6c101 of transformer-deploy together with PyTorch 1.11, onnx==1.12.0, and onnxruntime==1.12.0.
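For anyone wanting to reproduce that workaround, the pin set boils down to something like the following (a hypothetical `requirements.txt`; the exact torch 1.11 build tag is an assumption and depends on your CUDA toolkit):

```
# requirements.txt — pins matching the workaround above
# (torch build tag is an assumption; pick the wheel matching your CUDA version)
torch==1.11.0
onnx==1.12.0
onnxruntime==1.12.0
```

After installing these, check out the pinned commit of transformer-deploy (`git checkout 8bfe4f58a4cbdf84348a37838ba61c980bc6c101`) before rerunning the notebook.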