
t5 notebook broken with transformer-deploy 0.5.0


The t5.ipynb notebook is broken when run in the transformer-deploy 0.5.0 Docker container. Specifically, with

import gc
import random
from typing import Dict

import torch

# tokenizer, get_keep_fp32_nodes, convert_fp16, save_onnx and the model paths
# are all defined earlier in the notebook (the helpers come from transformer-deploy)


def get_random_input_encoder() -> Dict[str, torch.Tensor]:
    # random sequence length; batch size chosen so batch * seq_len <= max_seq
    max_seq = 128
    seq_len = random.randint(a=1, b=max_seq)
    batch = max_seq // seq_len
    random_input_ids = torch.randint(
        low=0, high=tokenizer.vocab_size, size=(batch, seq_len), dtype=torch.int32, device="cuda"
    )
    inputs = {"input_ids": random_input_ids}
    return inputs


# find the nodes that must stay in FP32, convert the rest to FP16, then save
keep_fp32_encoder = get_keep_fp32_nodes(onnx_model_path=encoder_model_path, get_input=get_random_input_encoder)
assert len(keep_fp32_encoder) > 0
enc_model_onnx = convert_fp16(onnx_model=encoder_model_path, nodes_to_exclude=keep_fp32_encoder)
save_onnx(proto=enc_model_onnx, model_path=encoder_fp16_model_path, clean=False)

del enc_model_onnx
torch.cuda.empty_cache()
gc.collect()

I get:

InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Deserialize tensor onnx::MatMul_2637 failed.corrupted protobuf data: tensor shape size(4194304) does not match the data size(0) in proto
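For what it's worth, the "data size(0)" part of the message suggests the initializer tensors have no payload, which typically happens when the model references external weight files that were never written or are not next to the .onnx file being loaded. A quick diagnostic sketch (my own, not from the notebook):

import onnx
from onnx.external_data_helper import uses_external_data

# load the graph only, without trying to resolve external weight files
model = onnx.load(encoder_model_path, load_external_data=False)
external = [t.name for t in model.graph.initializer if uses_external_data(t)]
print(f"{len(external)} initializers reference external data files")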

@pommedeterresautee

michaelroyzen · Aug 20 '22

Anyone have a workaround here?

david-rx · Aug 29 '22

Running an export of the encoder (T5-3b) without mixed precision also seems to give a similar error:

Traceback (most recent call last):
  ...
    encoder_onnx = create_model_for_provider(encoder_onnx_path_to_compare, "CUDAExecutionProvider", log_severity=3)
  File "/workspace/transformer-deploy/src/transformer_deploy/backends/ort_utils.py", line 85, in create_model_for_provider
    return InferenceSession(path, options, providers=provider_to_use)
  File "/home/af/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 347, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/af/.local/lib/python3.8/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 395, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Deserialize tensor embed_tokens.weight failed.corrupted protobuf data: tensor shape size(32876544) does not match the data size(0) in proto
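Since T5-3b is well over the 2 GB protobuf limit, ONNX has to store its weights as external data, and the resulting data file must sit next to the .onnx file that InferenceSession opens; tensors deserializing with zero size match that failure mode. A sketch of re-saving with the weights kept in a single external file (paths are hypothetical, and this assumes the original export actually wrote the weights):

import onnx

model = onnx.load("t5-3b-encoder.onnx")  # hypothetical path
onnx.save(
    model,
    "t5-3b-encoder-ext.onnx",
    save_as_external_data=True,
    all_tensors_to_one_file=True,
    location="t5-3b-encoder-ext.onnx.data",  # written next to the model file
)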

I was using torch==1.12.1+cu116, onnx==1.12.0, and onnxruntime-gpu==1.12.1, running on the same NVIDIA driver (515.65.01, CUDA 11.7) as the notebook. Could you share the library versions with which the notebook ran successfully?

caffeinetoomuch · Sep 02 '22

I temporarily solved the issue by using commit 8bfe4f58a4cbdf84348a37838ba61c980bc6c101 of transformer-deploy together with PyTorch 1.11, onnx==1.12.0, and onnxruntime==1.12.0.
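Roughly the following pins, for anyone who wants to reproduce (the exact commands are my reconstruction; the torch patch version and repo URL are assumed):

pip install torch==1.11.0 onnx==1.12.0 onnxruntime==1.12.0
pip install "git+https://github.com/ELS-RD/transformer-deploy@8bfe4f58a4cbdf84348a37838ba61c980bc6c101"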

michaelroyzen · Sep 04 '22