TensorRT undefined symbol error during onnx export when using pytorch-quantization

I build and maintain a custom image based on NVIDIA's cuda-runtime Docker images. It runs on a K8s platform to fine-tune an LLM and then convert it to ONNX.

Recently, I wanted to update the image to the latest libraries. After resolving a few conflicts, the error below is the only one I'm unable to understand or solve.

The GPU is always an A5000/A6000.

Cuda driver: NVIDIA-SMI 525.125.06 Driver Version: 525.125.06 CUDA Version: 12.0

PyTorch is the latest stable release, 2.1, so it should support CUDA 12.0.

I tried it with CUDA 12.2 based images as well and get the same error.

I have also tried to install pytorch-quantization from source and still get the same error.

Do I need to downgrade CUDA and/or torch?

The older images use torch 1.13, CUDA 11.7.1, and pytorch-quantization 2.1.2, and they work perfectly fine in the same environment.
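To compare the working and failing images, it can help to print the versions actually installed in each one. A minimal sketch, assuming the pip distribution names are `torch` and `pytorch-quantization`; the helper names `installed_version` and `environment_report` are hypothetical:

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a pip distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def environment_report(dist_names=("torch", "pytorch-quantization")):
    """Collect versions of the packages involved in the failing import."""
    report = {name: installed_version(name) for name in dist_names}
    try:
        import torch  # optional: only present inside the training image
        # CUDA version this torch build was compiled against (None for CPU builds)
        report["torch.version.cuda"] = torch.version.cuda
    except ImportError:
        report["torch.version.cuda"] = None
    return report


if __name__ == "__main__":
    for key, value in environment_report().items():
        print(f"{key}: {value}")
```

Running this in both the old and the new image gives a side-by-side view of what changed. Note that `torch.version.cuda` is the CUDA version torch was built against, which may differ from the version reported by nvidia-smi.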

The error:

ONNX export starting
Failed: ONNX export
Traceback (most recent call last):
  File "/app/tfdeployutils/pytorch_utils.py", line 107, in convert_to_onnx
    from pytorch_quantization.nn import TensorQuantizer
  File "/usr/local/lib/python3.10/dist-packages/pytorch_quantization/nn/__init__.py", line 19, in <module>
    from pytorch_quantization.nn.modules.tensor_quantizer import *
  File "/usr/local/lib/python3.10/dist-packages/pytorch_quantization/nn/modules/tensor_quantizer.py", line 26, in <module>
    from pytorch_quantization.tensor_quant import QuantDescriptor, tensor_quant, fake_tensor_quant
  File "/usr/local/lib/python3.10/dist-packages/pytorch_quantization/tensor_quant.py", line 28, in <module>
    from pytorch_quantization import cuda_ext
ImportError: /usr/local/lib/python3.10/dist-packages/pytorch_quantization/cuda_ext.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/app/optimize.py", line 91, in main
    create_onnx(output_dir)
  File "/app/optimize.py", line 53, in create_onnx
    convert_to_onnx(
  File "/app/tfdeployutils/pytorch_utils.py", line 109, in convert_to_onnx
    raise ImportError(
ImportError: It seems that pytorch-quantization is not yet installed. It is required when you enable the quantization flag and use CUDA device.Please find installation instructions on https://github.com/NVIDIA/TensorRT/tree/main/tools/pytorch-quantization or use:
pip3 install git+ssh://git@github.com/NVIDIA/TensorRT#egg=pytorch-quantization\&subdirectory=tools/pytorch-quantization/
Failed: ONNX export
All done

Oct 13 '23 08:10 accountForIssues

Hi, have you found a solution? I am also facing the same problem.

Oct 17 '23 03:10 Aquos06

@Aquos06 No, I still haven't found a solution.

The error contains "torchCheckFail", so I assume it's a compatibility issue between PyTorch and CUDA, but I don't understand why, since torch 2.1 is supposed to work with CUDA 12.

Further, since fine-tuning and inference work just fine, I don't think it's an issue with PyTorch itself.

The things I haven't tried:

Use the latest nvidia pytorch docker image as a base
Use the cuda 11.8 docker image as base

If you or anyone has tried the above or any other images and it still failed, please post here. I'll try them at some point and comment here.

Oct 17 '23 07:10 accountForIssues

I had the same error and solved it by reinstalling pytorch_quantization. I first tried these two install methods, neither of which worked:

pip install pytorch-quantization --extra-index-url https://pypi.ngc.nvidia.com
cd tools/pytorch-quantization && python setup.py install

Here is the one that worked:

cd tools/pytorch-quantization && pip install .
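After reinstalling, the failing import from the traceback can be retried on its own, without running the whole export. A minimal sketch; `try_import` is a hypothetical helper, and the module path `pytorch_quantization.cuda_ext` is taken from the traceback in the original report:

```python
import importlib


def try_import(module_name):
    """Try to import a module; return (ok, error_message)."""
    try:
        importlib.import_module(module_name)
        return True, ""
    except ImportError as exc:
        # An undefined-symbol failure in a compiled extension surfaces as ImportError
        return False, str(exc)


if __name__ == "__main__":
    ok, err = try_import("pytorch_quantization.cuda_ext")
    print("cuda_ext import OK" if ok else f"cuda_ext import failed: {err}")
```

If the message still mentions an undefined symbol, the compiled extension and the installed torch are probably still mismatched.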

Oct 17 '23 08:10 MengmengXiao

Thanks @MengmengXiao, I will give it a try. Do you think the package on their PyPI index is broken?

Oct 17 '23 08:10 accountForIssues

@accountForIssues Well, it's hard to say. But I did find some inconsistencies between the package and the sample code in the master docs (https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html). Take ONNX export, for example: I got a series of errors when I ran the sample code, and it seems the package has updated the related APIs. In the end, the export completed successfully after I followed https://github.com/NVIDIA/TensorRT/issues/3219#issue-1851352797

Oct 18 '23 03:10 MengmengXiao

Could you please provide a reproduction for us? Maybe something is broken. cc @pranavm-nvidia

Oct 18 '23 11:10 zerollzeng

@zerollzeng I'll see if I can create a small example.

But I can share the relevant code. The exception is thrown here, in this function - https://github.com/ELS-RD/transformer-deploy/blob/main/src/transformer_deploy/backends/pytorch_utils.py#L137

That function is being called in the code as below:

import torch
from transformers import GPT2TokenizerFast, GPT2LMHeadModel
from pytorch_utils import convert_to_onnx

# Stand-in for the fine-tuned model path (untested with a stock checkpoint)
path_to_pt_model = "gpt2-medium"

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-medium")

input_ids = tokenizer(
    "a prompt to pass through the model",
    add_special_tokens=True,
    return_attention_mask=False,
    return_tensors="pt",
)

# Cast every input tensor to int32 before export
for k, v in input_ids.items():  # str, torch.Tensor
    input_ids[k] = v.type(dtype=torch.int32)

# quantization=True is what triggers the pytorch_quantization import
convert_to_onnx(
    model_pytorch=GPT2LMHeadModel.from_pretrained(path_to_pt_model),
    output_path="./model.onnx",
    output_names=["output"],
    inputs_pytorch=dict(input_ids),
    quantization=True,
    var_output_seq=True,
)

I haven't tested it, but you should be able to use a standard PyTorch model such as "gpt2-medium" in place of path_to_pt_model to test it.

requirements.txt

accelerate==0.23.0
cryptography==41.0.4
datasets==2.14.5
evaluate==0.4.0
munch==4.0.0
onnx==1.14.1
onnxruntime-gpu==1.16.0
pydantic==1.10.13
python-jose==3.3.0
scikit-learn==1.3.1
torch==2.1.0
transformers==4.34.0

and then pytorch-quantization was installed in the first two ways as mentioned here - https://github.com/NVIDIA/TensorRT/issues/3381#issuecomment-1765914385

Oct 18 '23 14:10 accountForIssues

Quoting @MengmengXiao:

I had the same error and solved it by reinstalling pytorch_quantization. I first tried these two install methods, neither of which worked:

pip install pytorch-quantization --extra-index-url https://pypi.ngc.nvidia.com

cd tools/pytorch-quantization && python setup.py install

Here is the one that worked:

cd tools/pytorch-quantization && pip install .

This is interesting. Does pip install . do anything different? By the way, I found that the documentation is based on pytorch-quantization 2.2.0; however, I am not able to install that version.

Dec 01 '23 08:12 ynma-hanvo

This should be a _GLIBCXX_USE_CXX11_ABI={0,1} mismatch issue. See https://github.com/pytorch/pytorch/issues/13541
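One way to see which side of that ABI split each binary is on: the missing symbol in the traceback contains `NSt7__cxx11`, which suggests the extension was compiled expecting the C++11 `std::string` ABI, while `torch.compiled_with_cxx11_abi()` reports what the installed PyTorch was built with. A rough sketch; the helper name `symbol_uses_cxx11_abi` is hypothetical, and the substring check is a heuristic, not a full demangler:

```python
# Mangled symbol copied from the ImportError in the original report
MISSING_SYMBOL = (
    "_ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_string"
    "IcSt11char_traitsIcESaIcEEE"
)


def symbol_uses_cxx11_abi(mangled):
    """Heuristic: libstdc++ tags new-ABI std::string symbols with '__cxx11'."""
    return "__cxx11" in mangled


if __name__ == "__main__":
    print("extension expects cxx11 ABI:", symbol_uses_cxx11_abi(MISSING_SYMBOL))
    try:
        import torch
        print("torch built with cxx11 ABI:", torch.compiled_with_cxx11_abi())
    except ImportError:
        print("torch is not installed in this environment")
```

If the two values disagree, the extension and PyTorch were built with different `_GLIBCXX_USE_CXX11_ABI` settings, which would explain the undefined symbol.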

Dec 26 '23 03:12 cloudhan

In my case, I solved this issue by using a previous version: pip install pytorch-quantization==2.1.3

Feb 01 '24 14:02 soohyung-zhang

Closing this since there is a workaround (WAR). We also have a new quantization tool, https://github.com/NVIDIA/TensorRT-Model-Optimizer. Thanks all!

May 14 '24 16:05 ttyio
