TensorRT
undefined symbol error during onnx export when using pytorch-quantization
I build and use a custom image based on NVIDIA's cuda-runtime Docker images. It runs on a K8s platform to fine-tune an LLM and then convert it to ONNX.
Recently, I wanted to update the image to the latest libraries. After resolving a few conflicts, the error below is the only one I am unable to understand or solve.
The GPU is always an A5000/A6000.
Cuda driver: NVIDIA-SMI 525.125.06 Driver Version: 525.125.06 CUDA Version: 12.0
Pytorch is the latest stable 2.1 so it should support Cuda 12.0.
I tried it with CUDA 12.2 based images as well and get the same error.
I have also tried to install pytorch-quantization from source and still get the same error.
Do I need to downgrade CUDA and/or torch?
The older images use torch 1.13, CUDA 11.7.1, and pytorch-quantization 2.1.2, and they work perfectly fine in the same environment.
The error:
ONNX export starting
Failed: ONNX export
Traceback (most recent call last):
  File "/app/tfdeployutils/pytorch_utils.py", line 107, in convert_to_onnx
    from pytorch_quantization.nn import TensorQuantizer
  File "/usr/local/lib/python3.10/dist-packages/pytorch_quantization/nn/__init__.py", line 19, in <module>
    from pytorch_quantization.nn.modules.tensor_quantizer import *
  File "/usr/local/lib/python3.10/dist-packages/pytorch_quantization/nn/modules/tensor_quantizer.py", line 26, in <module>
    from pytorch_quantization.tensor_quant import QuantDescriptor, tensor_quant, fake_tensor_quant
  File "/usr/local/lib/python3.10/dist-packages/pytorch_quantization/tensor_quant.py", line 28, in <module>
    from pytorch_quantization import cuda_ext
ImportError: /usr/local/lib/python3.10/dist-packages/pytorch_quantization/cuda_ext.cpython-310-x86_64-linux-gnu.so: undefined symbol: _ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEE
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/app/optimize.py", line 91, in main
    create_onnx(output_dir)
  File "/app/optimize.py", line 53, in create_onnx
    convert_to_onnx(
  File "/app/tfdeployutils/pytorch_utils.py", line 109, in convert_to_onnx
    raise ImportError(
ImportError: It seems that pytorch-quantization is not yet installed. It is required when you enable the quantization flag and use CUDA device.Please find installation instructions on https://github.com/NVIDIA/TensorRT/tree/main/tools/pytorch-quantization or use:
pip3 install git+ssh://git@github.com/NVIDIA/TensorRT#egg=pytorch-quantization\&subdirectory=tools/pytorch-quantization/
Failed: ONNX export
All done
Hi, have you found a solution? I am also facing the same problem.
@Aquos06 No I still haven't found any solution.
The error contains "torchCheckFail", so I assume it's a compatibility issue between PyTorch and CUDA, but I don't understand why, since torch 2.1 is supposed to work with CUDA 12.
Further, seeing as fine-tuning and inference work just fine, I don't think it's an issue with PyTorch itself.
The things I haven't tried:
- Use the latest nvidia pytorch docker image as a base
- Use the cuda 11.8 docker image as base
If you or anyone has tried the above or any other images and it still failed, please post here. I'll try them at some point and comment here.
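One thing that may help narrow this down: judging by the traceback, the wrapper in pytorch_utils.py catches the original ImportError and re-raises it as "pytorch-quantization is not yet installed", which hides the real cause. Below is a minimal sketch for importing the relevant modules directly and printing the underlying error. The helper name `probe_import` is hypothetical and not part of any library mentioned here.

```python
import importlib


def probe_import(module_name):
    """Try to import a module and return (ok, detail).

    A missing package and an undefined-symbol failure in a compiled
    extension both surface as ImportError (or a subclass), so the
    detail string is what distinguishes the two cases.
    """
    try:
        importlib.import_module(module_name)
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"
    return True, "imported successfully"


if __name__ == "__main__":
    # Import each layer separately to see which one actually fails.
    for name in ("torch", "pytorch_quantization", "pytorch_quantization.cuda_ext"):
        ok, detail = probe_import(name)
        print(f"{name}: {'OK' if ok else 'FAILED'} ({detail})")
```

If `torch` imports but `pytorch_quantization.cuda_ext` fails with an undefined symbol, the package is installed and the problem is in how the extension was built against torch.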
I had the same error and solved it by reinstalling pytorch_quantization. I first tried these two install methods, and neither worked:
- pip install pytorch-quantization --extra-index-url https://pypi.ngc.nvidia.com
- cd tools/pytorch-quantization && python setup.py install
Here is the one that worked:
- cd tools/pytorch-quantization && pip install .
Thanks @MengmengXiao, I will give it a try. Do you think the package on their PyPI index is broken?
@accountForIssues Well, it's hard to say. But I did find some inconsistencies between the package and the sample code in the docs for master (https://docs.nvidia.com/deeplearning/tensorrt/pytorch-quantization-toolkit/docs/index.html). Taking ONNX export as an example, I got a series of errors when I ran the sample code, and it seems the package has changed the related APIs. In the end, the export completed successfully after following https://github.com/NVIDIA/TensorRT/issues/3219#issue-1851352797
Could you please provide a reproduction for us? Maybe something is broken. cc @pranavm-nvidia
@zerollzeng I'll see if I can create a small example.
But I can share the relevant code. The exception is thrown here, in this function - https://github.com/ELS-RD/transformer-deploy/blob/main/src/transformer_deploy/backends/pytorch_utils.py#L137
That function is being called in the code as below:
import torch
from transformers import GPT2TokenizerFast, GPT2LMHeadModel
from pytorch_utils import convert_to_onnx

# Untested stand-in for the path to the fine-tuned model.
path_to_pt_model = "gpt2-medium"

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2-medium")
input_ids = tokenizer(
    "a prompt to pass through the model",
    add_special_tokens=True,
    return_attention_mask=False,
    return_tensors="pt",
)
# Cast every input tensor to int32 before export.
for k, v in input_ids.items():  # str, torch.Tensor
    input_ids[k] = v.type(dtype=torch.int32)
convert_to_onnx(
    model_pytorch=GPT2LMHeadModel.from_pretrained(path_to_pt_model),
    output_path="./model.onnx",
    output_names=["output"],
    inputs_pytorch=dict(input_ids),
    quantization=True,  # this flag triggers the pytorch_quantization import
    var_output_seq=True,
)
I haven't tested it, but you should be able to use a standard PyTorch model such as "gpt2-medium" in place of path_to_pt_model to test it.
requirements.txt
accelerate==0.23.0
cryptography==41.0.4
datasets==2.14.5
evaluate==0.4.0
munch==4.0.0
onnx==1.14.1
onnxruntime-gpu==1.16.0
pydantic==1.10.13
python-jose==3.3.0
scikit-learn==1.3.1
torch==2.1.0
transformers==4.34.0
pytorch-quantization was then installed using the first two methods mentioned here - https://github.com/NVIDIA/TensorRT/issues/3381#issuecomment-1765914385
This is interesting. Does pip install . do anything different? By the way, I found that the documentation is based on pytorch-quantization 2.2.0. However, I am not able to install that version.
This is most likely a _GLIBCXX_USE_CXX11_ABI={0,1} mismatch between the prebuilt extension and the installed torch. See https://github.com/pytorch/pytorch/issues/13541
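The ABI explanation can be checked from the missing symbol itself. The mangled name in the error contains `NSt7__cxx1112basic_string`, i.e. `std::__cxx11::basic_string`, which suggests the extension was compiled with _GLIBCXX_USE_CXX11_ABI=1. If the installed torch was built with =0, it exports `torchCheckFail` only with the old `std::string` mangling, and the lookup fails. The sketch below compares the two; `uses_cxx11_abi` and `describe_mismatch` are hypothetical helper names, and the substring check is a heuristic (a symbol without std::string arguments reveals nothing about the ABI).

```python
MISSING_SYMBOL = (
    "_ZN3c106detail14torchCheckFailEPKcS2_jRKNSt7__cxx1112basic_string"
    "IcSt11char_traitsIcESaIcEEE"
)


def uses_cxx11_abi(mangled_symbol):
    """Heuristic: True if the mangled name references std::__cxx11 types."""
    return "__cxx11" in mangled_symbol


def describe_mismatch(mangled_symbol, torch_cxx11_abi):
    """Compare the ABI implied by a missing symbol with torch's build ABI."""
    if not uses_cxx11_abi(mangled_symbol):
        return "inconclusive: symbol does not reference std::__cxx11 types"
    if torch_cxx11_abi:
        return "consistent: extension and torch both use the C++11 ABI"
    return (
        "mismatch: extension expects _GLIBCXX_USE_CXX11_ABI=1 "
        "but torch was built with 0"
    )


if __name__ == "__main__":
    try:
        import torch
    except ImportError:
        print("torch is not installed; cannot query its ABI")
    else:
        # torch.compiled_with_cxx11_abi() reports how torch itself was built.
        print(describe_mismatch(MISSING_SYMBOL, torch.compiled_with_cxx11_abi()))
```

If this reports a mismatch, building pytorch-quantization from source against the installed torch, as in the `pip install .` workaround, would be expected to produce an extension with the matching ABI.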
In my case, I solved this issue by using a previous version: pip install pytorch-quantization==2.1.3
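When pinning versions like this, it can help to confirm what actually ended up in the image. A minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_version` is hypothetical, and the distribution names are the ones used with pip in this thread.

```python
from importlib import metadata


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    for dist in ("torch", "pytorch-quantization"):
        version = installed_version(dist)
        print(f"{dist}: {version if version is not None else 'not installed'}")
```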
Closing this since there is a workaround (WAR). We also have a new quantization tool, https://github.com/NVIDIA/TensorRT-Model-Optimizer. Thanks all!