`Invalid fd was supplied: -1` when loading an ONNX model with optimum.onnxruntime
I can load my LLM model directly with the onnx package on my Ubuntu x86 platform:
    import onnx

    model = onnx.load("Qwen_models/Qwen-7B-onnx/qwen_model.onnx")
    for node in model.graph.node:
        print(node.op_type)
but it fails with optimum. My code is as below:
    from transformers import AutoTokenizer
    from optimum.onnxruntime import ORTModelForQuestionAnswering

    tokenizer = AutoTokenizer.from_pretrained(
        "Qwen/Qwen-7B", trust_remote_code=True, pad_token="<|endoftext|>"
    )
    model = ORTModelForQuestionAnswering.from_pretrained(
        model_id="L1-m1ng/qwen7b-inf",  # "Qwen_models/Qwen-7B-onnx/"
        trust_remote_code=True,
    )
The complete error message is as below:
Traceback (most recent call last):
File "/home/mingli/projects/Qwen/run_ort_optimize.py", line 20, in
My environment (`pip list` output; columns: Package, Version, Editable project location):
absl-py 2.1.0 accelerate 0.26.1 aiohttp 3.9.3 aiosignal 1.3.1 antlr4-python3-runtime 4.9.3 async-timeout 4.0.3 attrs 23.2.0 auto_gptq 0.7.0 certifi 2024.2.2 chardet 5.2.0 charset-normalizer 3.3.2 click 8.1.7 cmake 3.25.0 colorama 0.4.6 coloredlogs 15.0.1 colorlog 6.8.2 contextlib2 21.6.0 contourpy 1.1.1 cycler 0.12.1 DataProperty 1.0.1 datasets 2.16.1 Deprecated 1.2.14 dill 0.3.7 einops 0.7.0 evaluate 0.4.1 filelock 3.13.1 flatbuffers 23.5.26 flatten-dict 0.4.2 fonttools 4.47.2 frozenlist 1.4.1 fsspec 2023.10.0 gekko 1.0.6 huggingface-hub 0.20.3 humanfriendly 10.0 hydra-colorlog 1.2.0 hydra-core 1.3.2 idna 3.6 importlib-resources 6.1.1 intel-extension-for-pytorch 2.2.0 Jinja2 3.1.3 joblib 1.3.2 jsonlines 4.0.0 kiwisolver 1.4.5 lit 15.0.7 lm_eval 0.4.0 /home/mingli/projects/Qwen/lm-evaluation-harness lxml 5.1.0 markdown-it-py 3.0.0 MarkupSafe 2.1.4 matplotlib 3.7.4 mbstrdecoder 1.1.3 mdurl 0.1.2 mpmath 1.3.0 multidict 6.0.5 multiprocess 0.70.15 networkx 3.1 neural-compressor 2.4.1 nltk 3.8.1 numexpr 2.8.6 numpy 1.24.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.19.3 nvidia-nvjitlink-cu12 12.3.101 nvidia-nvtx-cu12 12.1.105 omegaconf 2.3.0 oneccl-bind-pt 2.2.0+cpu onnx 1.15.0 onnxruntime 1.17.0 opencv-python-headless 4.9.0.80 optimum 1.17.1 optimum-benchmark 0.0.2 /home/mingli/projects/Qwen/optimum-benchmark optimum-intel 1.15.0.dev0 packaging 23.2 pandas 2.0.3 pathvalidate 3.2.0 peft 0.8.2 pillow 10.2.0 pip 24.0 portalocker 2.8.2 prettytable 3.9.0 protobuf 4.25.2 psutil 5.9.8 py-cpuinfo 9.0.0 py3nvml 0.2.7 pyarrow 15.0.0 pyarrow-hotfix 0.6 pybind11 2.11.1 pycocotools 2.0.7 Pygments 2.17.2 pyparsing 3.1.1 pyrsmi 1.0.2 pytablewriter 1.2.0 python-dateutil 2.8.2 pytz 2024.1 PyYAML 6.0.1 regex 2023.12.25 requests 2.31.0 responses 0.18.0 rich 13.7.0 rouge 1.0.1 rouge-score 0.1.2 sacrebleu 2.4.0 safetensors 0.4.2 schema 0.7.5 scikit-learn 1.3.2 scipy 1.10.1 sentencepiece 0.1.99 setuptools 68.2.2 six 1.16.0 sqlitedict 2.1.0 sympy 1.12 tabledata 1.3.3 tabulate 0.9.0 tcolorpy 0.1.4 threadpoolctl 3.2.0 tiktoken 0.5.2 tokenizers 0.15.1 torch 2.2.0+cu121 torchaudio 2.2.0+cu121 torchvision 0.17.0+cu121 tqdm 4.66.1 tqdm-multiprocess 0.0.11 transformers 4.35.2 transformers-stream-generator 0.0.4 triton 2.2.0 typepy 1.3.2 typing_extensions 4.9.0 tzdata 2023.4 urllib3 2.2.0 wcwidth 0.2.13 wheel 0.41.2 wrapt 1.16.0 xmltodict 0.13.0 xxhash 3.4.1 yarl 1.9.4 zipp 3.17.0 zstandard 0.22.0
Hi @L1-M1ng, how was Qwen_models/Qwen-7B-onnx/qwen_model.onnx exported? Could you share the model on e.g. the HF Hub?
I am not surprised that this fails, as Qwen is not natively supported in transformers & optimum.
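For architectures that optimum does support, the usual path is either the optimum-cli exporter or passing export=True to the ORT model class, and the resulting folder can then be loaded directly. A rough sketch of that flow for a natively supported causal-LM checkpoint (Qwen may not be covered by optimum's built-in export configs, and the model id below is just a placeholder):

    from transformers import AutoTokenizer
    from optimum.onnxruntime import ORTModelForCausalLM

    model_id = "gpt2"  # placeholder for an architecture optimum supports natively
    tokenizer = AutoTokenizer.from_pretrained(model_id)

    # export=True converts the PyTorch checkpoint to ONNX on the fly and
    # returns an ONNX Runtime-backed model; it can be saved and reloaded later.
    model = ORTModelForCausalLM.from_pretrained(model_id, export=True)
    model.save_pretrained("gpt2-onnx")

    # Equivalent CLI route:
    #   optimum-cli export onnx --model gpt2 gpt2-onnx/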